
QUANTITATIVE MANAGEMENT

Contents

Block 1 INTRODUCTION TO QUANTITATIVE MANAGEMENT AND STATISTICAL METHODS

Unit 1 INTRODUCTION TO QUANTITATIVE METHODS

Introduction, Meaning of Quantitative Methods, Classification of Quantitative Methods,


Classification of Statistical Methods, Models in Operations Research, Various Statistical
Methods, Operation Research Tools and Techniques, Importance of Quantitative
Methods, Application of Quantitative Methods
Unit 2 MEASURES OF CENTRAL TENDENCY

Introduction, Measures of Central Tendency, Arithmetic mean, Median, Mode, Quartile,


Comparative Analysis between Mean, Median and Mode
Unit 3 DISCRETE PROBABILITY DISTRIBUTION

Introduction, Random variable and Probability Distribution, Discrete Probability


Distribution, Binomial Distribution, Poisson distribution
Unit 4 CONTINUOUS PROBABILITY DISTRIBUTION

Introduction, Continuous Probability Distribution, Uniform Distribution, Normal


Distribution, Exponential Distribution
_________________________________________________________________________

Block 2 DECISION MAKING AND FORECASTING METHODS


Unit 1 DECISION THEORY
Introduction, Types of decision-making environments, Key problems in decision theory,
Steps of decision-making process, Decisions under uncertainty, Risk and Certainty, One-
stage and Multi-stage decision making, use of probabilities to make decisions, Decision
Tree
Unit 2 CORRELATION AND REGRESSION ANALYSIS
Introduction to Correlation, Pearson product-moment correlation co-efficient,
Introduction to Regression analysis, Simple regression analysis, Residual analysis,
Standard error of estimate, Co-efficient of determination
Unit 3 FORECASTING
Introduction, General Steps of Forecasting Techniques, Types of Forecast Models, Time-
Series Analysis – Components of Time-Series Analysis, Moving Average, Exponential
Smoothing, Measures Forecast Accuracy, Least Square Regression Analysis, Application
Areas of Forecasting
________________________________________________________________________
Block 3 LINEAR PROGRAMMING PROBLEM AND SPECIAL PROBLEMS

Unit 1 LINEAR PROGRAMMING FORMULATION AND GRAPHICAL METHOD


Introduction to Linear Programming Problems (LPP), Characteristics of LPP, Linear
Programming Formulation, Solution method – Graphical Solution method (Only for two
Decision Variables), Slack and Surplus, Types of Constraints, Special Cases of LPP,
Applications of LPP in Business

Unit 2 LPP-SIMPLEX METHOD


Introduction, Basic format of simplex method, Principles of simplex method, Steps of
Simplex Method, Linear Programming –Solution method – Simplex Method (>= 2
Decision Variables),

Unit 3 TRANSPORTATION
Introduction, Basic structure of transportation, Transportation problem- Initial Basic
feasible solution (North west corner rule, Least Cost Rule, Vogel’s approximation
method), Test for optimality (The Modified Distribution (MODI) method), Special cases
of transportation

Unit 4 ASSIGNMENT
Introduction, Basic structure of assignment, Approach of the Assignment model, Solution
Method (Hungarian method), Special cases of Assignment
_______________________________________________________________

Block 4 SPECIFIC OPERATION RESEARCH METHODS


Unit 1 PROJECT SCHEDULING-PERT/CPM

Introduction, PERT/CPM Network, Project scheduling with Certain Activity Times,


Project Scheduling with Uncertain Activity Times
Unit 2 WAITING LINE MODELS

Introduction, Waiting Line System, Operating Characteristics of Waiting Line System,


Waiting Line Models, Single Channel Poisson Arrivals with Exponential Service
Times (M/M/1), Multiple Channel Poisson Arrivals with Exponential Service
Times (M/M/C), Single Channel Poisson Arrivals with Arbitrary Service Times (M/G/1)
Unit 3 GAME THEORY

Introduction, Basic Concepts in Game theory, Two-person zero-sum game, Game with
no Saddle Point, Principle of Dominance, Solution of 2 x n and m x 2 games
Block no.1 Introduction to Quantitative Management
and Statistical methods
_________________________________
Block Introduction
In this block, an introduction to quantitative methods will be given. The basic difference between
statistics and operations research will be discussed, and the role, importance and applications of
quantitative methods in business will be explained. In the second unit, the meaning and
importance of measures of central tendency will be discussed, and the various measures of central
tendency and their comparative analysis will be covered. In the third unit, discrete probability
distributions and their various types will be discussed. In the last unit, continuous probability
distributions and their various applications will be covered.

Block Objective
• Understand the meaning of quantitative methods

• Appreciate the difference between statistics and operations research

• Explain the role and importance of quantitative methods

• Explain various techniques of quantitative methods

• Understand applications of quantitative methods in Business


• Understand the meaning of Central tendency
• Understand the importance of measures of central tendency
• Compute various measures of central tendency- arithmetic mean, weighted mean,
geometric mean, harmonic mean, median and mode
• Explain the relationship between mean, median and mode

• Understand the importance of probability distributions in decision making

• Explain random variable and its types

• Identify the various situations where discrete probability distributions can be applied.

• Understand Binomial distribution and its uses

• Explain Poisson distribution and its uses

• Understand the importance of continuous probability distributions in decision making

• Identify the situations where continuous probability distributions can be applied

• Explain Uniform distribution and its application


• Understand Normal distribution and its application

• Explain Exponential distribution and its application

Block Structure

Unit 1 Introduction to Quantitative Methods


Unit 2 Measures of Central Tendency
Unit 3 Discrete Probability Distribution
Unit 4 Continuous Probability Distribution
Unit 1 : INTRODUCTION TO QUANTITATIVE METHODS
_________________________________
Unit Structure
1.0Learning Objectives

1.1 Introduction

1.2 Meaning of Quantitative Methods

1.3 Classification of Quantitative Methods

1.3.1 Statistical Methods

1.3.2 Operation Research

1.4 Classification of Statistical Methods

1.5 Models in Operations Research

1.6 Various Statistical Methods

1.7 Operation Research Tools and Techniques

1.8 Importance of Quantitative methods

1.8.1 Advantages of Statistics in Business

1.8.2 Advantages of Operation Research in Business

1.9 Application of Quantitative Methods

1.9.1Application of Statistics in Business

1.9.2 Application of Operation Research in Business

1.10 Let’s Sum Up

1.11 Answers to Check your Progress

1.12 Glossary

1.13 Assignment

1.14 Activities
1.15 Case Study

1.16 Further Readings


1.0 Learning Objectives
After learning this unit, you will be able to:

• Understand the meaning of quantitative methods

• Appreciate the difference between statistics and operations research

• Explain the role and importance of quantitative methods

• Explain various techniques of quantitative methods

• Understand applications of quantitative methods in Business

1.1 Introduction
Decision making is an integral part of the management of an organization. Every day, business
managers are required to make decisions. The key managerial functions of planning, organizing,
directing and controlling require management to be engaged continuously in the process of
decision making pertaining to each of them. We can therefore say that management can be
regarded as equivalent to decision making.
Historically, decision making was considered purely an art, acquired over a period of time through
experience. Various styles of decision making were observed when similar managerial problems
were solved by different people in real business situations. Many times managers resort to their
"instincts" to make decisions (unstructured decision making). However, the environment in which
management has to operate these days is complex and fast changing, and there is a great need to
augment the art of decision making with systematic and scientific methods. Most decisions cannot
be taken on the basis of 'rule of thumb', common sense or snap judgment. For businesses, a single
wrong decision may have long-term painful implications, and present-day managers cannot work
by trial and error. A systematic approach to decision making is necessary, as the cost of making
errors may be too high and at times irreversible. Thus the managers in the business world should
understand the importance of a scientific methodology of decision making. It means defining the
problem in a clear manner, collecting the required data, analyzing the data thoroughly, deriving and
forming conclusions about the data and finally implementing the solution.
Although the qualitative approach is inherent in the manager and usually improves with experience,
the skills of the quantitative approach need to be learned by studying its assumptions and
methods. A manager who is knowledgeable in quantitative methods can compare and evaluate
the qualitative and quantitative sources of recommendations and finally combine the two sources
to choose the best possible decision.

1.2 Meaning of Quantitative methods


Quantitative methods can be understood as a collection of statistical and operations research
(management science) techniques that provide powerful means of analysis, using quantitative
data, for effective decision making in business. These techniques involve a systematic and
scientific approach to solving complex business problems.
Quantitative methods involve the use of numbers, symbols, mathematical expressions and other
elements of quantities. They are used to supplement the judgment and intuition of decision
makers. The essential idea of the quantitative approach to decision making is that if the factors
that influence a decision can be identified and quantified, it becomes easier to resolve the
complexity of the problem at hand. These methods help businesses make optimum use of limited
resources. In other words, quantitative methods help in choosing the best course of action from
the alternative courses of action available, so as to achieve the optimum value of the objective or
goal.

1.3 Classification of Quantitative Methods


There are various types of quantitative methods that are used as tools for decision making in
business. These methods are broadly categorized into Statistical techniques and Operations
Research techniques:

i) Statistical Methods

ii) Operations Research (programming or Management Science) methods

1.3.1 Statistical Methods
Statistics is a science dealing with the collection, analysis, interpretation and presentation of
numerical data. As an example, suppose a company is interested in knowing the satisfaction
level of its consumers. The first step will be the collection of data on satisfaction level, the
factors of satisfaction and other variables related to consumer behaviour. The data so obtained
can be organised on the basis of various demographic and classification variables like age,
income, gender, education level, region etc. This organised data may then be presented by means
of tables or various types of graphs to facilitate analysis. The average satisfaction level can be
derived and further compared on the basis of measured variables like age. This information will
help to determine whether a particular age group is more satisfied than others. Similarly, various
kinds of analysis will give insights for drawing conclusions about the population being studied.
This will further help in decision making related to improving the satisfaction level of customers
of the targeted product.

Classification of Statistical data


The data used in a statistical study is broadly classified into two types: (1) primary data and (2)
secondary data. When the data used in the study is collected specifically for the purpose of the
study, it is referred to as primary data. Primary data is collected afresh for the first time and thus
has originality in its character. On the other hand, when the data was collected for some other
purpose and is derived from other sources, it is referred to as secondary data. Secondary data is
collected by some organization, is available in published form and is used by someone else for
their research.

The same data can be called primary or secondary depending on who is using it. For example,
suppose a researcher wants to study the economic conditions of labourers in India. If the
researcher collects the data directly using a questionnaire, it is called 'primary data'. However, if
some other researcher subsequently uses this data for some other purpose, then the same data
becomes 'secondary data'.

Whenever one is doing research, it must first be checked whether any secondary data is available
on the subject matter of interest which can be used, as this will save a lot of time and money.
However, the data must be verified thoroughly for its reliability and accuracy. Its relevance and
the context in which it was collected should also be verified, since it was originally collected
for another purpose. The researcher would need to collect original data according to his
objectives when secondary data is either not available or not reliable.

There are many international bodies that regularly collect and publish large amounts of data, such
as the International Monetary Fund (IMF), World Health Organization (WHO), Asian Development
Bank, International Labour Organisation, United Nations Organization, World Meteorological
Organization and Food and Agriculture Organization (FAO); the Government and its many
agencies: Reserve Bank of India, Census Commission, and ministries such as the Ministry of
Economic Affairs and the Commerce Ministry; as well as private research organisations, trade
associations, etc. Examples of government publications in India are reports on currency and
finance, the India Trade Journal, the Statistical Abstract of India, the Indian Customs and Central
Excise Tariff, the Reserve Bank of India Bulletin, Agricultural Statistics of India, the Economic
Survey, and Indian Foreign Statistics.

1.3.2 Operations Research


Operations research is a method of employing mathematical representations or models to analyse
business problems in order to take management decisions. This dominant characteristic of
mathematical representation or model building gives operations research an approach distinct
from statistics. The scientific method translates a given real problem into a mathematical form,
which is solved and re-transformed into the original context. The OR approach consists of the
following steps: (1) Formulate the problem (2) Develop a model (3) Obtain the input data
(4) Solve the model (5) Validate the model (6) Implement the solution.

1. Problem Formulation: The first step in operations research is to develop a clear and
concise statement of the problem. It is essential to identify and understand the root problem
in order to get the right answer. The symptom should not be confused with
the problem. For example, higher production cost is a symptom, where the underlying
problem may be improper inventory levels, excessive wastage, poor quality control,
etc. The symptoms are only an indication of the problem, and hence the manager should
go beyond the symptoms to identify the real cause of the problem. There may also be
multiple problems, and one may be related to another. The organization often selects those
problems whose solution would either result in increasing profit or decreasing cost. So it
is imperative for an analyst to have extensive interaction with the management
regarding the selection and interpretation of the available data. This step often involves
various activities like site visits, meetings, research, conferences and observations,
which provide the analyst with the information required to formulate the right
problem.
2. Model Building: Once a problem is identified, the next step is to develop a model. A
model is a representation of some abstract or real-life problem. The models are basically
mathematical models, which describe systems and processes in the form of equations,
formulas and relationships. The activities in this step involve defining the variables, studying
their relationships and formulating equations to represent the problem. The model is then
tested under different environmental constraints and revised so that it works.
3. Obtaining the input data: The next step is to obtain the data to be used in the model as
input. The data should be accurate, relevant and complete in all respects. The quality of
the input data will decide the quality of the output. A number of sources, including
company reports and documents and interviews with company employees, may be used for
data collection.
4. Solution of the Model: The next stage of analysis is finding the solution and interpreting
it in the context of the problem. A solution to a model means determination of a specific
set of decision variables that would give the desired level of output. The desired level of
output is the level which 'optimises'. Optimisation means maximisation of goal
attainment from a given set of resources, or minimisation of the cost that will satisfy the
required level of goal attainment.
5. Model Validation: Validation of the model means checking whether the developed model
adequately predicts the behaviour of the actual system it represents. It involves
checking the model's reliability and ascertaining whether its structural assumptions are
met. A normal practice is to test validity by comparing the model's performance on
available past data with that of the actual system.
6. Implementation: The final step is the implementation of the results. It is the process of
incorporating the developed model as a solution in the organization. The techniques and
methods of operations research are based on mathematical concepts and neglect the
human aspects, which are most important at the time of implementation. The impact of
the decision will be influenced by the level of motivation, resistance to change and the desire
to be informed among employees. It is very important to handle these issues tactfully
for successful implementation of the solution. A model which gives average theoretical
advantage but is implementable is better than one which ranks high on theoretical
advantage but cannot be implemented.
Check your Progress 1
1. Individual respondents, focus groups, and panels of respondents are categorised as
a) Primary Data Sources
b) Secondary Data Sources
c) Itemized Data Sources
d) Pointed Data Sources

2. The method of employing mathematical representations or models to analyse business
problems to take management decisions is known as
a) Operations Research
b) Statistical methods
c) Economics
d) Mathematics

3. The first stage of statistics is


a) Analysing
b) Collection of data
c) Presentation
d) Interpretation
1.4 Classification of Statistical methods
The statistical methods can be classified into basically two groups- Descriptive and Inferential
Statistics
Descriptive Statistics: Data gathered on a group to describe or reach conclusions about that
same group are called descriptive statistics. Suppose a professor computes an average grade for
one English class and uses statistics to describe the performance of that one class; this is
descriptive statistics. Descriptive statistics include the various methods of collection and
presentation of data, measures of central tendency, dispersion, shape, index numbers etc.

Inferential Statistics: If a researcher gathers data from a sample and uses the statistics generated
to reach conclusions about the population from which the sample was taken, the statistics are
inferential statistics. The data gathered are used to infer something about a larger group.
Continuing with the same example, if the professor uses the statistics on the average grade achieved
by one class to estimate the average grade achieved by all five sections of the same English course,
the process of estimating this average grade would be called inferential statistics. Inferential
statistics are sometimes also referred to as inductive statistics. We need to understand the words
'statistic' and 'parameter' to understand inferential statistics.
• A statistic is a descriptive measure computed from a sample of data. For example, the mean
(x̄) and standard deviation (s) of a sample are known as 'statistics'.
• A parameter is a descriptive measure computed from an entire population of data. For
example, the mean (µ) and standard deviation (σ) of a population are known as 'parameters'.
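To make the distinction concrete, here is a minimal Python sketch (the employee ages are the illustrative figures used later in this unit; the five-value sample is hypothetical). The statistics module in the standard library provides both the sample and the population versions of the standard deviation:

    import statistics

    population = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]  # ages of all N = 10 employees
    sample = population[:5]                                # a hypothetical sample of n = 5

    # Parameters: descriptive measures computed from the entire population
    mu = statistics.mean(population)        # population mean
    sigma = statistics.pstdev(population)   # population standard deviation

    # Statistics: descriptive measures computed from the sample only
    x_bar = statistics.mean(sample)         # sample mean
    s = statistics.stdev(sample)            # sample standard deviation (n - 1 divisor)

    print(mu, sigma, x_bar, s)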

Check your Progress 2


1. Graphical and numerical methods are specialized processes utilized in
a) Education Statistics
b) Descriptive Statistics
c) Business Statistics
d) Social Statistics
2. A numerical value used as a summary measure for a sample, such as a sample mean, is
known as a
a) Population Parameter
b) Sample Parameter
c) Sample Statistic
d) Population Mean

1.5 Models in Operations research


As mentioned earlier, the concept of model building lies at the heart of the operations research
approach to problem solving. A model is a theoretical abstraction of a real-life problem. As
many real-life problems are complex and may involve many factors, the decision maker has to
choose those factors which are relevant to the problem. After selecting the critical factors, they
are combined in a logical manner to form a model of the actual problem. There are three types of
OR models: (i) Iconic Models (ii) Analogue Models (iii) Symbolic Models
i. Iconic Models

Iconic models represent a system the way it is, but in different size. They are essentially the
scaled up/down versions of the particular thing they represent. It is obtained by reducing or
enlarging the size of the system. In other words they are images. A model of a proposed building
by an architect, model of solar system, model of molecular structure of a chemical, a toy
aeroplane are some examples of iconic model. Maps, photographs, drawings may also be
categorized as iconic models as they look like what they represent, except in size. The advantage
of iconic models is that they are specific and represent the thing visually. The disadvantage is that
they cannot be manipulated for experimental purposes; they cannot be used to study changes
in the operation of a system.

ii. Analogue Models

The analogue models use one set of properties to represent another set of properties. After the
problem is solved, it is interpreted in terms of the original system. For example the electrical
network model may be used as an analogue model to study the flows in a transportation system.
The contour lines on a map are analogues of elevation as they represent the rise and fall of
height. In general, the analogue models are less specific and concrete as compared to the iconic
models but can be manipulated more easily.

iii. Symbolic Models

In symbolic models letters, numbers and other types of mathematical symbols are used to
represent variables and the relationship between them. These are the most general and abstract
type of models. These models can be verbal or mathematical. The verbal models represent a
situation in spoken language or written words, whereas mathematical models use mathematical
notation to represent the situation. The difference between the two can be understood by taking
the example of measuring the area of a rectangle. A verbal model would express it as: the area of
the rectangle (A) is equal to the length (L) of the rectangle multiplied by its breadth (B). The
mathematical model is represented as: A = L x B. Both models yield the same result;
however, the mathematical model is more precise.

Symbolic models are used in Operations research as they are easier to manipulate and yield
better results as compared to iconic or analogue models.

1.6 Various Statistical methods


There are many statistical techniques which are useful for the decision maker in solving
problems. A brief explanation of some of the techniques is given below to orient you towards
them. Many of these techniques will be discussed in detail in later units.

Frequency distribution and Graphical representation

Once data is collected, it needs to be summarized and presented to the decision maker in a form
that is easy to understand and comprehend. Tabulation helps this process through effective
presentation. Classification of the data showing the different values of the variable and their
respective frequencies of occurrence is called the frequency distribution of the values. There are
two kinds of frequency distribution: discrete frequency distribution and continuous frequency
distribution. Graphical representation is more effective in communicating the information.
Through graphs and charts, the decision maker can often get an overall picture of the data and
reach very useful conclusions merely by studying the chart or graph.

Measures of Central tendency

The concept of central tendency plays an important role in the study and application of statistics.
There is an inherent tendency of the data to cluster or group around a central value. This behaviour
of the data to concentrate around the central part of the data set is called the 'central tendency' of
the data. Measures of central tendency enable us to find the single value at which the data is
considered to be concentrated. Measures of central tendency also help to compare two or more sets
of data, for example the average sales figures of two months. There are three common measures of
central tendency: mean, median and mode. The mean is the most widely used measure. The arithmetic
mean is the average of a group of numbers and is computed by summing all numbers and
dividing by the number of observations. The median is the middle value in a set of data that has
been ordered from lowest to highest (ascending) or highest to lowest (descending).It is the value
that splits ordered data into two equal parts. The mode is the most frequently occurring value of a
set of data.

Measures of variability

Measures of variability explain the spread or dispersion of a set of data. It explains the variation
in the values and how different the values are from the mean. Usually measures of variability are
used together with the measures of central tendency to make a complete description of the data.
There are a number of measures of dispersion, such as the range, interquartile range, mean absolute
deviation, variance and standard deviation.

Probability Distribution

A random variable is a numerical description of the outcome of an experiment. A probability
distribution describes the possible outcomes of a random variable together with their associated
probabilities; it states how probabilities are distributed over the values of the random variable.
There are two types of probability distribution: discrete probability distributions and continuous
probability distributions. When the random variable can take only a limited number of values
(basically whole numbers), the probability distribution is discrete. However, when the random
variable can take any value over a range (decimal values as well), the probability distribution is
continuous.

Correlation

Correlation is a measure of the degree of relatedness of variables. For example, how strong is the
correlation between the producer price index and the unemployment rate? In retail sales, are
sales related to population density, number of competitors, size of the store, amount of
advertising, or other variables? The correlation coefficient measures the degree of association of
one variable with another. The Pearson product-moment correlation (r) is used when both
variables being analyzed have at least an interval level of data. The term r is a measure of the
linear correlation of two variables. It is a number that ranges from -1 through 0 to +1, representing
the strength of the relationship between the variables. An r value of +1 denotes a perfect positive
relationship between two sets of numbers. An r value of -1 denotes a perfect negative
correlation, which indicates an inverse relationship between two variables. An r value of 0 means
no linear relationship is present between the two variables.
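As a rough sketch of the computation (plain Python, with a small hypothetical data set in which y happens to be an exact linear function of x, so r comes out at +1):

    import math

    x = [2, 4, 5, 7, 9]        # hypothetical advertising spend
    y = [10, 18, 22, 30, 38]   # hypothetical sales (here exactly y = 4x + 2)

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Pearson r: sum of cross-deviations divided by the product of deviation norms
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mean_x) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - mean_y) ** 2 for yi in y))
    r = sxy / (sx * sy)
    print(r)   # approximately 1.0: a perfect positive linear relationship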

Regression
Regression analysis is the process of developing a model to predict the value of a numerical
variable based on the values of one or more other variables. The most elementary regression
model is called simple regression or bivariate regression, involving two variables in which one
variable is predicted by another variable. In simple regression, the variable to be predicted is
called the dependent variable and is designated as y. The predictor is called the independent
variable, or explanatory variable, and is designated as x. In simple regression analysis, only a
straight-line relationship between the two variables is examined. In multiple regression, more than
one independent variable is used to predict the dependent variable.
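A minimal sketch of fitting a simple regression line by least squares, using hypothetical (x, y) pairs rather than any data set from the text:

    x = [1, 2, 3, 4, 5]      # independent (predictor) variable, hypothetical
    y = [3, 5, 7, 10, 11]    # dependent variable, hypothetical

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least-squares estimates of slope b1 and intercept b0 for y = b0 + b1 * x
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
        sum((xi - mean_x) ** 2 for xi in x)
    b0 = mean_y - b1 * mean_x

    # Residuals are the differences between observed and predicted y values
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    print(b1, b0, residuals)   # slope 2.1, intercept 0.9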

Forecasting

Forecasting is the art or science of predicting the future values of a variable. Forecasting methods
can be classified as qualitative and quantitative. The quantitative methods can be used only
when the variable under study can be quantified and historical data is available. A time series
is a set of observations of a variable measured over a period of time at regular intervals. The
objective of time-series methods is to discover a pattern in the historical data and then extrapolate
this pattern into the future.
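As a small illustration of one quantitative forecasting technique mentioned in the block contents, the sketch below computes a three-period moving average over a hypothetical monthly sales series and uses the last average as the forecast for the next month:

    sales = [42, 45, 49, 47, 52, 55, 58]   # hypothetical monthly sales

    window = 3
    # Each moving average is the mean of the most recent `window` observations
    moving_averages = [sum(sales[i - window + 1:i + 1]) / window
                       for i in range(window - 1, len(sales))]
    forecast_next = moving_averages[-1]    # forecast for month 8

    print(moving_averages)   # [45.33..., 47.0, 49.33..., 51.33..., 55.0]
    print(forecast_next)     # 55.0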

Decision Theory

Decision theory, also called decision analysis, is used to determine the optimal strategy where a
decision maker is faced with several decision alternatives and an uncertain pattern of future
events. All decision-making situations usually have two or more alternative courses of action
available for the decision maker to choose from. There are various possible outcomes, called
states of nature, which are beyond the control of the decision maker. A decision may be defined as
the selection of the act which is considered to be the best according to a predefined standard, from
the available options.

Index Number

An index number is a ratio of a measure taken during one time period to the same measure taken
during another time period, usually called the base period. The ratio is often multiplied by 100
and expressed as a percentage. Index numbers are very useful for reflecting inter-period differences.
Using index numbers, a researcher can transform the data into values that are more usable and make
it easier to compare other years to one particular key year. Index numbers are widely used around
the world to relate information about stock prices, inflation, sales, exports, imports, agricultural
prices etc. Some examples of specific indexes are the employment cost index, the price index for
construction, the producer price index and the consumer price index. For example, if the Consumer
Price Index for the year 2020 is 150, it means prices have gone up by 50 % relative to the base
period; the Consumer Price Index (CPI-U), compiled by the Bureau of Labor Statistics, is based
upon a 1982 base value of 100.
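A short sketch of the index-number idea in Python, using hypothetical prices and a hypothetical base year:

    # Hypothetical average price of a basket of goods, by year
    prices = {2015: 120.0, 2018: 150.0, 2020: 180.0}
    base_year = 2015

    # Simple price index: (price in given period / price in base period) x 100
    index = {year: round(p / prices[base_year] * 100, 1) for year, p in prices.items()}
    print(index)   # {2015: 100.0, 2018: 125.0, 2020: 150.0}
    # An index of 150.0 in 2020 means prices rose 50 % relative to the 2015 base.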
Check your Progress 3
1. The variables whose calculation is done according to the height, length, and weight are
categorised as
a) Discrete Variables
b) Flowchart Variables
c) Measuring Variables
d) Continuous Variables
2. The art or science of predicting the future values of a variable is called
a) Regression
b) Forecasting
c) Probability distribution
d) Index numbers

1.7 Operations Research Tools and Techniques


Various tools and techniques of operations research are available. Some of the most widely used
techniques are linear programming, game theory, decision theory, queuing theory, inventory
models, simulation, non-linear programming, integer programming, dynamic programming,
sequencing theory, Markov processes and network scheduling (PERT/CPM). A brief explanation
of some of these tools and techniques is as follows:

Linear programming:
It is a mathematical modelling technique for selecting the best alternative from a set of feasible
alternatives, in situations where the objective function as well as the constraints can be expressed
as linear mathematical functions. The objective function may be maximization of profit/sales or
minimization of cost/time etc. There are many methods to solve a linear programming problem.
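As a rough illustration (a sketch only, assuming the SciPy library is available; the product-mix figures are hypothetical), a two-variable profit-maximization problem can be set up and solved as follows:

    from scipy.optimize import linprog

    # Maximize profit 40*x1 + 30*x2 subject to
    #   2*x1 + 1*x2 <= 100   (machine hours available)
    #   1*x1 + 1*x2 <= 80    (labour hours available)
    #   x1, x2 >= 0
    # linprog minimizes by default, so the objective coefficients are negated.
    c = [-40, -30]
    A_ub = [[2, 1], [1, 1]]
    b_ub = [100, 80]

    result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
    print(result.x)      # optimal quantities of the two products, approximately [20, 60]
    print(-result.fun)   # maximum profit, approximately 2600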

Transportation:
The transportation problem arises in planning for the distribution of goods and services from
various supply locations to different demand locations. Normally the quantity of goods available
at supply location (origin) is limited and the quantity of goods required at demand location
(destination) is known. Mostly the objective is to minimize the total transportation cost of
shipping the goods from origin to destination

Assignment
An assignment problem arises in many decision-making situations in an organization, like
assigning jobs to machines, workers to machines, clerks to counters, sales personnel to sales
territories etc. It is a special type of linear programming problem, with the constraint that one job
can be assigned to one and only one machine.

Game theory
Game theory is used to make decisions in conflicting situations in which there are two or
more players/adversaries/opponents. Each player selects a strategy independently without
knowing in advance the strategy of the other player or players. The combination of the competing
strategies provides the value of the game to the players. Game theory applications have been
developed for situations in which the competing players are teams, companies, political
candidates, armies or contract bidders.

Project scheduling
Managers are responsible for planning, scheduling and controlling projects that consist of
numerous jobs or tasks performed by various departments or individuals. The Program
Evaluation and Review Technique (PERT) and the Critical Path Method (CPM) are extremely helpful
in these situations. The objective is to complete the project on time while adhering to the precedence
requirements (which means some activities must be completed before other activities can be
started).

Waiting line theory


There are many situations where a queue is formed: customers waiting for service, machines
waiting for repair work, jobs waiting for processing in computers. The objective is to minimize
the cost of waiting without increasing the cost of servicing. Waiting line models consist of
mathematical formulas and relationships that can be used to determine the operating
characteristics of the waiting line. A waiting line is also known as a queue, and the body of
knowledge dealing with waiting lines is known as queuing theory.
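For the single-channel (M/M/1) case, the standard operating-characteristic formulas can be evaluated directly; the sketch below uses hypothetical arrival and service rates (these formulas assume Poisson arrivals, exponential service times and an arrival rate below the service rate):

    lam = 4.0   # average arrivals per hour (hypothetical)
    mu = 6.0    # average service completions per hour (hypothetical), mu > lam

    rho = lam / mu                       # utilisation of the server
    L_q = lam ** 2 / (mu * (mu - lam))   # average number waiting in the queue
    L = L_q + lam / mu                   # average number in the system
    W_q = L_q / lam                      # average time waiting in the queue (hours)
    W = W_q + 1 / mu                     # average time in the system (hours)

    print(rho, L_q, L, W_q, W)   # 0.67, 1.33, 2.0, 0.33, 0.5 (rounded)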

Simulation
Simulation is one of the most widely used quantitative approaches to decision making. It
involves developing a model of some real phenomenon and then performing experiments on the
model evolved. It is a descriptive and not an optimizing technique. In simulation, a given system
is copied and the variables and constants associated with it are manipulated in an artificial
environment to study the behaviour of the system.
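A minimal sketch of the idea, simulating a hypothetical daily demand distribution with Python's random module and estimating the long-run average demand from the simulated runs:

    import random

    random.seed(1)   # fix the seed so the experiment is reproducible

    # Hypothetical daily demand distribution: value -> probability
    demand_values = [10, 20, 30, 40]
    probabilities = [0.2, 0.4, 0.3, 0.1]

    # Simulate 1,000 days of demand and estimate the average daily demand
    simulated = random.choices(demand_values, weights=probabilities, k=1000)
    average_demand = sum(simulated) / len(simulated)
    print(average_demand)   # close to the expected value of 23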

Check your Progress 4


1. The technique used to make decisions in conflicting situations in which there are two or
more players is
a) Decision Theory
b) Waiting line theory
c) Simulation
d) Game theory

1.8 Importance of Quantitative techniques

Quantitative methods provide managers with a variety of tools from statistics and operations
research for handling problems in modern business in a scientific way.

1.8.1 Advantage of Statistics in Business

1. Give accurate and specific description: Facts can be conveyed in a precise form when
stated quantitatively using statistics. For example, the statement that the infant mortality rate was
30 per cent in 2018, as compared to 35 per cent in 2015, is more specific than stating that the infant
mortality rate in 2018 had decreased in comparison to the year 2015.

2. Convert data into information: Statistics help in reducing the amount of data collected and
convert it to more meaningful information for making decisions. For example the census data of
individual households on the number of members is a huge mass of data, and it will be difficult to
draw any conclusions without applying statistics.

3. Facilitate Comparison of data: It helps in the comparison of data, as the data is collected in
the form of numbers. The present data can be compared with the previously collected data to
study the pattern of increase or decrease in a phenomenon. For example there can be a
comparison of month-wise sales figure data of a company to identify the trend.

4. Forecast future events: Statistical methods are very useful in predicting future events. For
example to take the decision on production scheduling, an automobile manufacturer would like
to know the past sales figures. Based on these figures, future sales can be predicted and
accordingly the required number of automobiles can be manufactured

5. Formulate and test assumptions: An important application of statistics is formulating
assumptions about the population and testing them on the basis of the sample data collected. For
example, a hypothesis can be made that work from home is more productive in the IT industry. A
survey of employees of the IT sector can be conducted and various hypothesis-testing tools can be
applied to draw conclusions.

1.8.2 Advantages of Operations Research in Business

Tools for scientific analysis: Operations research models provide a systematic, scientific and
logical way of understanding and solving problems. It is not possible to take decisions based on
intuition alone due to the increased complexity of business. These techniques help the decision
maker describe and solve the problem more precisely.

Provide solutions to business problems: Operations research techniques provide solutions in
almost every area of a business. These techniques are used in areas like production,
marketing, finance and others to find the answer to questions like how much inventory should
be carried to minimize cost.

Optimum allocation of resources: Resource allocation can be considered optimal if, for a given
level of output, production is done at minimum cost, or if, at a given cost, maximum output is
produced. Operations research tools like linear programming, transportation and assignment
enable a manager to allocate the resources in an organization optimally.

Choosing an optimal business strategy: Using operations research techniques like game
theory, it is possible to determine the optimal strategy for an organization that is facing
competition from rivals with conflicting interests.
Facilitate and improve the quality of decision making: A decision maker can use various
mathematical models to take better-informed decisions in the face of uncertainty. Operations
research techniques like decision theory improve the quality of decision making. Multiple
variables or resources can be formulated and manipulated as a model to take optimum decisions.

1.9 Applications of quantitative methods in Business and management


Managers in all functional areas use statistics and operations research methods to make better
informed decisions.

1.9.1 Application of Statistics in Business:


1. Accounting and Finance
Budget preparation, Financial forecast, investment decisions, Credit risk and policies,
auditing function
2. Production
Production planning and control, machine performance evaluation, inventory control,
quality control
3. Manufacturing
Inventory control, production planning and scheduling, production smoothing, quality
control, reducing wastage
4. Marketing
Analysis of marketing data, sales forecasting
5. Human Resource/ Personnel Management
Labor attrition rate, employment trends, performance appraisal, wage rates, incentive
plans
6. Economics
Measurement of GDP, input –output analysis, business cycle and seasonal fluctuations,
comparison of market prices, cost and profit, population analysis, economic policy
evaluation
7. Basic Sciences
Study of plant life, efficacy of a drug, development of vaccines, Diagnosis of disease
based on data like temperature, BP , pulse rate, weight etc.
8. Research and Development
Development of new product lines, evaluation of existing products

1.9.2 Applications of Operations Research in Business


1. Accounting
Credit Policy Analysis, Cash flow planning, planning account strategy, assigning auditing
teams, establishing cost
2. Construction
Project planning, scheduling and control, deploying work force, allocation of resources to
projects
3. Finance
Portfolio analysis, Investment analysis, Building financial models, capital allocation
decisions, cash management models, dividend policy decisions.
4. Manufacturing
Inventory control, production planning and scheduling, production smoothing, quality
control, reducing wastage
5. Marketing
Budget allocation for advertising, Product mix decisions, New product introduction,
effective packaging, promotion decisions
6. Human Resource
Human resource planning, recruitment, training programme scheduling, assignment and
balancing of skills, designing organizational structure
7. Purchasing
Optimal ordering and reordering, optimal purchase, Material transfer
8. Facility Planning
Facility location, Estimating number of facilities requirement, transportation decisions,
warehouse location decisions, Logistic system design
9. Research and Development
R&D budgeting, R&D project control, planning of product introduction

Check your Progress 5


1. Review of performance appraisal, labour turnover rates, planning of incentives, and
training programs are examples of
a) Statistics in Production
b) Statistics in Marketing
c) Statistics in Finance
d) Statistics in Personnel Management
2. Credit Policy Analysis, Cash flow planning, planning account strategy, assigning
auditing teams, establishing cost are examples of
a) Statistics in Production
b) Statistics in Marketing
c) Statistics in Finance
d) Statistics in Personnel Management

1.10 Let Us Sum Up


This course is about how quantitative methods may be used to help managers make better
decisions. This unit attempted to explain the meaning and use of various quantitative analysis
methods in the field of business and management. The two branches of quantitative analysis,
statistics and operations research, were discussed in detail. Statistics is a science dealing with the
collection, analysis, interpretation and presentation of numerical data. Data gathered on a group
to describe or reach conclusions about that same group are called descriptive statistics. Data
gathered from a sample, with statistics generated to reach conclusions about the population from
which the sample was taken, are known as inferential statistics.

Operations Research is a method of employing mathematical representations or models to


analyze business problems to take management decisions. The discussion in this unit was
centered on the problem orientation of quantitative methods and an overview of how
mathematical models can be used in analysis. Mathematical models are abstractions of real
world situations and may not be able to capture all the aspects of the real situation. However if a
model can capture the major relevant aspects of the problem and can provide a recommended
solution, it can be valuable in decision making.

Various methods used in statistics and operations research were discussed in brief. The benefits
and advantages of quantitative methods, along with their applications in various functional areas,
were also covered in this unit. The importance and complexity of the decision-making process has
resulted in wide application of quantitative techniques.

1.11 Answers for Check Your Progress

Answers to check your progress 1


1. (a)
2. (a)
3. (b)
Answers to check your progress 2
1. (b)
2. (c)
Answers to check your progress 3
1. (d)
2. (b)
Answers to check your progress 4
1. (d)
Answers to check your progress 5
1. (d)
2. (c)

1.12 Glossary

Statistics: A science dealing with the collection, analysis, interpretation and presentation of
numerical data.
Descriptive Statistics: Data gathered on a group to describe or reach conclusions about that same
group.
Inferential Statistics: Data gathered from a sample and statistics generated to reach conclusions
about the population from which the sample was taken.
Primary Data: Data collected specifically for the purpose of the study in which it is used.
Secondary Data: Data that was collected for some other purpose and is derived from other
sources.
Statistic: A descriptive measure computed from a sample of data.
Parameter: A descriptive measure computed from an entire population of data.
Random Variable: A numerical description of the outcome of an experiment.
Discrete Random Variable: A random variable that can take a limited number of values
(basically whole numbers).
Continuous Random Variable: A random variable that can take any value over a range (decimal
values as well).
Operations Research: A method of employing mathematical representations or models to
analyze business problems to take management decisions.
Model: A representation of a real object or situation.
Iconic Model: A physical replica or representation of a real object.
Analogue Model: A model that represents a phenomenon of the world by another, more
understandable or analyzable system.
Mathematical Model: Mathematical symbols and expressions used to represent a real situation.

1.13 Assignment
1. What is statistics? Explain the types of statistics with examples.
2. Discuss the stages of operations research in detail.
3. Describe the various types of operations research models.
4. List at least five techniques used in statistics and operations research.
5. Describe the advantages of quantitative methods.
6. Discuss applications of statistics and operations research in the functional areas of management.

1.14 Activities
Take an example of a major decision you have taken recently. List the steps you had taken to
reach the final decision.

1.15 Case Study
A manufacturing company makes electric wiring, which it sells to contractors in the
construction industry. Approximately 900 electrical contractors purchase wire from the company.
The Director of Marketing wants to determine the electrical contractors' satisfaction. He developed a
questionnaire that yields a satisfaction score between 10 and 50 for participant responses. A
random sample of 35 of the 900 contractors is asked to complete the survey. The satisfaction
scores for the 35 participants are averaged to compute the average satisfaction score.
1. Describe the population and the sample for this study.
2. What will be the statistic and the parameter for this study?
3. How can the findings of this study be used in decision making?
1.16 Further Reading

1. Applied Business Statistics, Ken Black, Wiley Publications.
2. Business Statistics, David M. Levine et al., Pearson Education.
3. Statistics for Management, Levin and Rubin, Pearson Education.
4. Operations Research, Hamdy A. Taha, Pearson Education.
5. Operations Research: Theory and Applications, J.K. Sharma, Macmillan India Ltd.
6. Quantitative Techniques in Management, N.D. Vora, McGraw Hill.
7. Quantitative Methods for Business, Anderson, Sweeney and Williams, Thomson
Publications.
Unit No. 2 Measures of Central Tendency
_________________________________
Unit Structure

2.0 Learning Objectives
2.1 Introduction
2.2 Measures of Central Tendency

2.2.1 Importance of measures of central tendency


2.2.2 Properties of a good measure of central tendency
2.2.3 Common measures of central tendency
2.3 Arithmetic Mean
2.3.1 Arithmetic mean for Grouped data

2.3.2 Weighted Arithmetic mean

2.3.3 Geometric mean

2.3.4 Harmonic mean

2.4 Median

2.4.1 Median for Ungrouped data

2.4.2 Median for grouped Data


2.5 Mode

2.5.1 Mode for grouped Data

2.6 Quartile

2.7 Comparative Analysis between Mean, Median and Mode

2.7.1 Relationship between Mean, Median and Mode

2.8 Let Us Sum Up


2.9 Answers for Check your Progress
2.10 Glossary
2.11 Assignment
2.12 Activities
2.13 Case Study
2.14 Further Reading
2.0 Learning Objectives
After learning this unit, you will be able to:
• Understand the meaning of Central tendency
• Understand the importance of measures of central tendency
• Compute various measures of central tendency- arithmetic mean, weighted mean,
geometric mean, harmonic mean, median and mode
• Explain the relationship between mean, median and mode

2.1 Introduction
In the introductory unit, an overview of the various types of statistical methods used in
management decision making was given. The purpose of descriptive statistics is to describe
and summarise the data. Descriptive statistics include various measures like measures of central
tendency, measures of variation, measures of shape and kurtosis. Measures of central
tendency are among the most important and widely used tools for describing and summarizing
data. In this unit we will be exploring the concept of central tendency and the various measures used
to measure it. The objective is to identify a single value which can act as a representative of the
given data. This value can be used to make conclusions and decisions related to the entire data set.
The computation of the various measures is different for ungrouped and grouped data and hence
these will be discussed separately.

2.2 Measures of Central Tendency


The concept of central tendency is an integral part of statistics. It is observed that any set
of data has an inherent tendency to cluster or group around a central value. For example, in a
class test of 30 marks, it can easily be assumed that most of the students will be getting marks
between 10 and 20. This tendency of the data to group or fall in the middle part of the data set is
known as 'central tendency', and the methods used to measure this tendency of the data are known
as measures of central tendency.

2.2.1 Importance of measures of central tendency

Measures of central tendency enable us to get an idea of the entire data from a single value
where the data is considered to be concentrated. For example, it is impossible to remember the
sales figures of the various retail outlets in a region, but the average could be used to draw
conclusions about the sales of the entire region. The average condenses a great amount of data
into a single representative value, so that the data can be summarized easily. Measures of central
tendency also enable us to compare two or more sets of data. For example, the average sales figures
of two brands in the same product category can be compared.
2.2.2 Characteristics of a good measure of central tendency
A good measure of central tendency should possess, as far as possible, the following
characteristics:
• Easy to understand
• Easy to compute
• Based on all the observations
• Uniquely defined
• Possibility of further algebraic treatment
• Not unduly affected by extreme values

2.2.3 Common measures of central tendency


These are the most common measures of central tendency used in business application:
1. Mean
2. Median
3. Mode
Some of the other measures used are quartiles, deciles and percentiles. Each of them has its
advantages and disadvantages. Here we will be discussing the concepts and the methods of
manual calculation; however, these can also be calculated easily using MS Excel.

2.3 Arithmetic mean


The arithmetic mean (typically referred to as mean or average) is the most common measure of
central tendency. The mean is the only measure in which all the values play an equal role. Mean
is calculated by adding all the values in the data set and then dividing that sum by the number of
values in the data set.
The population mean is represented by the Greek letter ‘µ’. For a population containing ‘N’
values the equation for the mean of a population is written as:

\mu = \frac{\sum x_i}{N} = \frac{x_1 + x_2 + x_3 + \cdots + x_N}{N}

The sample mean is represented by the symbol x̄, called 'x bar'. The formula for computing the
sample mean for 'n' values is written as follows:

\bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}
For example, the ages of the employees in a company are: 39, 29, 43, 52, 39, 44, 40, 31, 44 and 35.
As can be seen from the data, there are ten employees in the organization. The population arithmetic
mean can be calculated as:
\mu = \frac{39+29+43+52+39+44+40+31+44+35}{10} = \frac{396}{10} = 39.6
Therefore the average age of employees in the organization is 39.6 years
The calculation of the sample mean uses the same formula as for the population mean and would
have resulted in the same answer if computed on the given data. However, it is inappropriate to
compute the sample mean for a population or a population mean for a sample. It should be
noted that as the entire employees' data was included, this is population data. If a sample of five
out of the ten employees had been taken, then we would have calculated the sample mean. In
statistics it is important to clearly differentiate between sample and population data.
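The same population mean can be verified with a few lines of Python (a sketch using the ages listed above):

    ages = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]   # ages of all ten employees

    population_mean = sum(ages) / len(ages)
    print(population_mean)   # 39.6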
2.3.1 Arithmetic mean for grouped Data
We have already seen how to compute the arithmetic mean of ungrouped data. When the data is
classified in the form of a frequency distribution, we are working with grouped data. With
grouped data, the specific values are unknown, as the data is in the form of class intervals. The
midpoint of each class interval is used to represent all the values in the class interval. This
midpoint is weighted by the frequency of values in the class interval.
The arithmetic mean for grouped data is computed by summing the product of the class
midpoint (M_i) and the class frequency (f_i) of each class and dividing that sum by the total
number of frequencies (N). The formula for the mean of grouped data is as follows:

\mu = \frac{\sum f_i M_i}{N} = \frac{\sum f_i M_i}{\sum f_i}

This method is illustrated with the help of the given data on the age groups of people in an area.

Age Group Frequency Age Group Frequency

18-24 17 48-54 30

24-30 22 54-60 32

30-36 26 60-66 21

36-42 35 66-72 15

42-48 33

For calculating the arithmetic mean, we need the following table:

Class Interval fi Mi fiMi

18-24 17 21 357

24-30 22 27 594

30-36 26 33 858

36-42 35 39 1365
42-48 33 45 1485
48-54 30 51 1530

54-60 32 57 1824

60-66 21 63 1323

66-72 15 69 1035
∑fi = 231 ∑fiMi = 10,371

\mu = \frac{\sum f_i M_i}{N} = \frac{\sum f_i M_i}{\sum f_i} = \frac{10371}{231} = 44.896
Hence the average age of people in the area is 44.90 years
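The grouped-data calculation above can be checked with a short Python sketch using the class midpoints and frequencies from the table:

    midpoints = [21, 27, 33, 39, 45, 51, 57, 63, 69]
    frequencies = [17, 22, 26, 35, 33, 30, 32, 21, 15]

    # Grouped mean: sum of (frequency x midpoint) divided by the total frequency
    total_f = sum(frequencies)                                               # 231
    grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / total_f
    print(round(grouped_mean, 2))   # approximately 44.9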
To simplify the calculation, there is a shortcut method to calculate the arithmetic mean. An
arbitrarily selected constant value is assumed as the mean. This value is selected in a way that
it simplifies the calculation, by using the deviation of each observation from it instead of
the actual data value. The assumed mean method uses the following equation:
\mu = A + \frac{\sum f_i d_i}{N}
where A is the arbitrarily selected constant value (the assumed mean),
d_i = deviation of each class midpoint from the assumed mean, and
N = \sum f_i = number of observations.
To apply the formula, let us consider the following distribution of marks of 40 students in an
examination:
Class Interval 10-20 20-30 30-40 40-50 50-60 60-70 70-80
Frequency 5 3 4 7 2 6 13

Class Interval Mid point fi di fidi


10-20 15 5 -30 -150
20-30 25 3 -20 -60
30-40 35 4 -10 -40
40-50 45 7 0 0
50-60 55 2 10 20
60-70 65 6 20 120
70-80 75 13 30 390
∑𝑓𝑖 =40 ∑𝑓𝑖 𝑑𝑖 =280
\mu = A + \frac{\sum f_i d_i}{N} = 45 + \frac{280}{40} = 52
The arithmetic mean of the marks scored by the students is 52. You can easily see that this method
is simpler to calculate and hence gives a faster solution. The same problem, if solved using the
previous method, will yield the same answer.
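Both routes can be confirmed with a small Python sketch for the marks data (the direct midpoint method and the assumed-mean shortcut give the same figure):

    midpoints = [15, 25, 35, 45, 55, 65, 75]
    frequencies = [5, 3, 4, 7, 2, 6, 13]
    A = 45                                   # assumed mean (midpoint of the middle class)

    N = sum(frequencies)                     # 40
    direct = sum(f * m for f, m in zip(frequencies, midpoints)) / N
    shortcut = A + sum(f * (m - A) for f, m in zip(frequencies, midpoints)) / N
    print(direct, shortcut)                  # 52.0 52.0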

2.3.2 Weighted Arithmetic Mean

In calculating the arithmetic mean, equal importance is given to all the observations. But there are
situations where the relative importance of different values is not the same. In such cases, the
weighted arithmetic mean needs to be used. The procedure is similar to the calculation of the
grouped-data arithmetic mean, where frequency is used as the weight associated with the class
interval. For example, for the data values x1, x2, x3, ..., xn and associated weights w1, w2, w3, ..., wn,
the weighted arithmetic mean can be computed using the formula:

\mu_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}
You are aware of the use of weighted averages when the various components of an
evaluation are not equally important. For example, suppose your final grade is composed of 30
per cent of the mid-term score, 50 per cent of the final exam score and 20 per cent of the assignment
score. Then the final grade will be calculated by multiplying each score (x_i) by its weight (w_i):

\[ \mu_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{30x_1 + 50x_2 + 20x_3}{30 + 50 + 20} \]
So if you score 85 marks in mid term, 75 in final exam and 90 in assignment, then the
weighted average will be:
\[ \mu_w = \frac{30 \times 85 + 50 \times 75 + 20 \times 90}{100} = \frac{8100}{100} = 81 \]
Some common applications of the weighted arithmetic mean are the calculation of index numbers such as the consumer price index and the BSE Sensex, where different weights are associated with items or shares.
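The grade example above can be checked in a few lines; this illustrative sketch (not from the original text) uses the scores and weights given earlier.

```python
# Weighted arithmetic mean of exam scores
scores  = [85, 75, 90]        # mid-term, final exam, assignment
weights = [30, 50, 20]        # percentage weights

weighted_mean = sum(w * x for w, x in zip(weights, scores)) / sum(weights)
print(weighted_mean)          # 81.0
```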

2.3.3 Geometric Mean

We often deal with quantities that change over a period of time and need to find the average rate of change, such as an average growth rate or depreciation rate over a period of time. In such cases, the simple arithmetic mean is inappropriate, as it gives a misleading answer. The appropriate measure of central tendency is the geometric mean.
The geometric mean is defined as the nth root of the product of the n values of the data. If x1, x2, x3, ..., xn are the values of the data, then the geometric mean is:

\[ GM = \sqrt[n]{x_1 \times x_2 \times x_3 \times \cdots \times x_n} \]

When the number of observations is large, logarithmic transformation can be applied to simplify the calculation. Taking logarithms on both sides, the formula becomes:

\[ \log(GM) = \frac{\sum \log(x_i)}{n} = \frac{1}{n}\left(\log x_1 + \log x_2 + \log x_3 + \cdots + \log x_n\right) \]

\[ GM = \operatorname{antilog}\left\{\frac{\sum \log(x_i)}{n}\right\} \]
The geometric mean is useful for finding the average percentage increase in sales, production, population, etc. It is the most representative average in the construction of index numbers. When large weights are to be given to smaller values and small weights to larger values, the most appropriate average to be used is the geometric mean. Let's take an example to understand the computation of the geometric mean.
The inflation rate in percentage for the past six months is given as 5.5, 6.2, 7.2, 6, 6.5 and 5.9. Find the average inflation rate over the past six months.

First, we find the growth factor for each month by dividing the percentage rate by 100 and adding 1. Then we take the GM of these factors as the average factor, from which we can find the average inflation rate.

\[ GM = \sqrt[6]{1.055 \times 1.062 \times 1.072 \times 1.06 \times 1.065 \times 1.059} = \sqrt[6]{1.4359} = 1.062 \]

Thus the average inflation rate = (1.062 − 1) × 100 = 6.2%.
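The inflation example can be reproduced with the sketch below (added for illustration); it converts each percentage rate to a growth factor, takes the geometric mean and converts back to a rate.

```python
import math

# Average inflation rate via the geometric mean of monthly growth factors
rates = [5.5, 6.2, 7.2, 6.0, 6.5, 5.9]          # percent per month
factors = [1 + r / 100 for r in rates]

gm = math.prod(factors) ** (1 / len(factors))   # geometric mean of the factors
print(round((gm - 1) * 100, 1))                 # about 6.2 (percent)
```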

2.3.4 Harmonic Mean

The harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations. If x1, x2, x3, ..., xn are the values of the data, then the harmonic mean is given by the formula:

\[ HM = \frac{n}{\left(\dfrac{1}{x_1} + \dfrac{1}{x_2} + \cdots + \dfrac{1}{x_n}\right)} = \frac{n}{\sum \dfrac{1}{x_i}} \]

The harmonic mean is appropriate when the data values are ratios of two variables with different units of measurement, called rates. The harmonic mean is very useful for computing the average speed of a journey or the average price at which a product is sold. In finance, the harmonic mean is used to determine the average of financial multiples such as the P/E ratio.

Let’s take an example to understand the computational procedure of harmonic mean.

A journey from place X to Y is completed using four different cars. The average speeds of the cars are 50 km/hr, 75 km/hr, 60 km/hr and 80 km/hr. Find the average speed of the journey.
The average speed of the journey is calculated as:

\[ HM = \frac{4}{\dfrac{1}{50} + \dfrac{1}{75} + \dfrac{1}{60} + \dfrac{1}{80}} = 64 \text{ km/hr} \]

Like the arithmetic mean and the geometric mean, the harmonic mean also uses all the values for computing the average. However, the harmonic mean cannot be used when one or more observations have a zero value or when the observations take both positive and negative values. The harmonic mean has very limited applications in business.
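The average-speed example lends itself to a short sketch as well (added for illustration):

```python
# Harmonic mean: average speed of a journey driven at several speeds
speeds = [50, 75, 60, 80]                       # km/hr for each car

hm = len(speeds) / sum(1 / s for s in speeds)
print(round(hm, 1))                             # 64.0 km/hr
```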

Check your Progress 1


1. What is the major assumption we make when computing a mean from grouped data
a) All values are discrete
b) Every value in a class is equal to the midpoint
c) No value occurs more than once
d) Each class contains same number of values
2. When calculating the average rate of debt expansion for a company, the correct mean
to use is the
a) Arithmetic mean
b) Weighted arithmetic mean
c) Geometric mean
d) Either (a) or (c)

3. The following frequency distribution has been constructed from data on air transport traffic. Calculate the arithmetic mean.
No of passengers travelling:  20-30  30-40  40-50  50-60  60-70  70-80
No of airports:                   8      7      1      0      3      1

4. The management of a restaurant has employed 2 managers, 5 cooks and 10 waiters. The monthly salaries of the managers, cooks and waiters are 30000, 20000 and 10000 per month respectively. Find the mean salary paid per month by the management.

2.4 Median
Median is a measure of central tendency different from all the averages we have discussed
so far. Median is the middle value in a set of data that has been arranged in ascending or
descending order. While computing various types of means all the values in the data set are
used, whereas the median is a single value from the data set that is the middle-most or central item in the set of numbers. Half of the values lie above this point and the other half lie below it.
2.4.1 Median for Ungrouped data
To find the median of ungrouped data, first arrange the data in ascending or descending
order. If the data set contains an odd number of values, the middle item (median) is one
of the original observations. If there is an even number of values, the median is the
average of the two middle observations. The formula for the median position is:

\[ \text{Median} = \left(\frac{N+1}{2}\right)\text{th item in the data array} \]
Suppose we want to find the median of a data set containing seven observations. Then as per the above formula, the median is the (7+1)/2 = 4th value in the data set. Let's take an example of data on the time taken (in minutes) to complete a task daily over ten days. First the data have to be arranged in ascending order:

Ordered data

29  31  35  39  39  40  43  44  44  52

\[ \text{Median} = \left(\frac{N+1}{2}\right)\text{th item} = \frac{10+1}{2} = 5.5\text{th item} \]
As the median is at the 5.5th item, we take the average of the 5th and the 6th values, which are 39 and 40. Therefore the median is (39 + 40)/2 = 39.5. The median of 39.5 means that for half of the days the time taken to do the task is less than or equal to 39.5 minutes, and for half of the days the time taken is greater than or equal to 39.5 minutes.
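The rule for odd and even numbers of observations can be expressed compactly in code; this is a minimal sketch (added for illustration) applied to the ten ordered task times above.

```python
# Median of ungrouped data (handles both odd and even N)
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]   # already sorted

n = len(times)
if n % 2 == 1:
    median = times[n // 2]                            # middle value
else:
    median = (times[n // 2 - 1] + times[n // 2]) / 2  # average of the two middle values
print(median)                                         # 39.5
```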

2.4.2 Median for grouped Data

For grouped data, we first find the value N/2. Then from the cumulative frequencies we find the class in which the (N/2)th item falls. Such a class is called the median class. The median is then calculated using the following formula:

\[ \text{Median} = L + \frac{\frac{N}{2} - cf_p}{f_{md}} \times w \]
where, L= lower limit of the median class
cfp= cumulative frequency of the class preceding the median class
fmd = frequency of the median class
w = width of the class .

As an illustration, consider the following frequency distribution of unemployment rates recorded over 60 years.

Class Interval:  1-3  3-5  5-7  7-9  9-11  11-13
Frequency:         4   12   13   19     7      5

To facilitate the process of locating the median class, let’s find the cumulative frequency.
Class Interval Frequency Cumulative frequency

1-3 4 4

3-5 12 16

5-7 13 29

7-9 19 48

9-11 7 55

11-13 5 60

Median position = (N/2)th value = 60/2 = 30th value. Let's understand how to locate the median class using the cumulative frequency column. It can be seen that the 1st to 4th values lie in class 1-3, the 5th to 16th in the second class, the 17th to 29th in the third class, the 30th to 48th in the fourth class, and similarly for the rest of the values. Thus the 30th value lies in the class interval 7-9.

\[ \text{Median} = L + \frac{\frac{N}{2} - cf_p}{f_{md}} \times w = 7 + \frac{\frac{60}{2} - 29}{19} \times 2 = 7 + 0.105 = 7.105 \]
The median value of the unemployment rates is 7.105.

Like the grouped arithmetic mean, the grouped median is an approximate value. It is based on the assumption that the actual values fall uniformly across the median class interval, which may not always be true.
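For reference, the grouped-median interpolation can be sketched as follows (an illustrative addition using the unemployment data above):

```python
# Median of grouped data using the interpolation formula
lower_limits = [1, 3, 5, 7, 9, 11]
frequencies  = [4, 12, 13, 19, 7, 5]
width = 2

N = sum(frequencies)            # 60
half = N / 2                    # 30

cum = 0
for L, f in zip(lower_limits, frequencies):
    if cum + f >= half:         # this class contains the (N/2)th item
        median = L + (half - cum) / f * width
        break
    cum += f
print(round(median, 3))         # 7.105
```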

Check your Progress 2


1. Which of the following is the first step in calculating the median of a data set?
a) Average the middle two values of the data set
b) Arrange the data in order
c) Determine the relative weights of data values
d) None of these
2. For the following data, compute the median and interpret the value

Class: 0-1 1-2 2-3 3-4 4-5 5-6


Frequency: 1 4 8 6 3 1
2.5 Mode
Mode is a measure of central tendency that is similar to the median in that it is not arithmetically calculated like the mean. The mode is the value that is repeated most often in the data set. If two values tie for the highest frequency, the data set is said to be bimodal; data sets with more than two modes are called multimodal.

Mode is rarely used as a measure of central tendency for ungrouped data, as sometimes a single unrepresentative value might occur most often just by chance. For example, in the data series 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 9, 12, 12 and 12, the mode is 12, as it occurs the maximum number of times. But as can be observed, it is not representative of the central part of the data, since most of the values actually lie below 10.

2.5.1 Mode of Grouped Data

When data is grouped in the form of a frequency distribution, it is assumed that the mode is located in the class with the most items. The class with the highest frequency is called the modal class. To determine the mode from the modal class, the following formula is used:

\[ \text{Mode} = L + \left(\frac{d_1}{d_1 + d_2}\right) w \]

Where, L = lower limit of the modal class
d1 = f1 − f0 and d2 = f1 − f2, with
f1 = frequency of the modal class
f0 = frequency of the class preceding the modal class
f2 = frequency of the class succeeding the modal class
w = width of the class interval

To illustrate the computation of the mode, let's consider the following data on the ages of students enrolled for a programme:

Class:       15-20  20-25  25-30  30-35  35-40
Frequency:      10      9      3      4      4

The modal class is 15-20, as the highest frequency is 10. Substituting the values into the formula:

d1 = f1 − f0 = 10 − 0 = 10 and d2 = f1 − f2 = 10 − 9 = 1

\[ \text{Mode} = L + \left(\frac{d_1}{d_1 + d_2}\right) w = 15 + \left(\frac{10}{10 + 1}\right) \times 5 = 19.55 \]
The mode of the age of students enrolled for the programme is 19.55 years.
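The modal-class formula can also be checked with a short sketch (added for illustration), using the figures from the example above.

```python
# Mode of grouped data from the modal class
L, w = 15, 5           # lower limit and width of the modal class 15-20
f1, f0, f2 = 10, 0, 9  # modal class, preceding and succeeding frequencies

d1, d2 = f1 - f0, f1 - f2
mode = L + d1 / (d1 + d2) * w
print(round(mode, 2))  # 19.55
```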
Check your progress 3
1. Compute mode

Class: 0-1 1-2 2-3 3-4 4-5 5-6

Frequency: 1 6 10 9 3 1

2.6 Quartiles
Quartiles, deciles and percentiles are positional measures of central tendency. These are useful and quite frequently used measures. The most familiar positional averages are quartiles, deciles and percentiles.

Quartiles: Quartiles are values that divide the data into four equal parts. To divide data into four parts we need three partition values, called Quartile 1, Quartile 2 and Quartile 3. The first quartile Q1 is such that 25% of the values are smaller and 75% of the observations are larger than this value. The second quartile Q2 is the median, as 50% of the values are smaller and 50% of the observations are larger than it. The third quartile Q3 divides the data in such a way that 75% of the values are smaller and 25% of the observations are larger than Q3.

The ith quartile is located at the (iN/4)th item of the data set. The class in which the quartile lies is known as the quartile class. The formula for computing quartiles for grouped data, similar to the median formula, is as follows:

\[ Q_i = L + \frac{\frac{iN}{4} - cf_p}{f_q} \times w \qquad \text{for } i = 1, 2, 3 \]
where, L = lower limit of the quartile class
cfp= cumulative frequency of the class preceding the quartile class
fq = frequency of the quartile class
w = width of the class .

Deciles: Deciles are values that divide the data into ten equal parts. Since we need nine points to divide a data set into ten parts, there are nine deciles, denoted D1, D2, D3, ..., D9. The ith decile is located at the (iN/10)th item of the data set, where i = 1, 2, 3, ..., 9. The class in which the decile falls is known as the decile class. The formula for computing deciles for grouped data is:

\[ D_i = L + \frac{\frac{iN}{10} - cf_p}{f_d} \times w \qquad \text{for } i = 1, 2, 3, \ldots, 9 \]
where, the symbols have usual meaning and interpretation
Percentiles: Percentiles are values which divide the data into one hundred equal parts. There are ninety-nine percentiles, denoted P1, P2, P3, ..., P99. The ith percentile is located at the (iN/100)th item of the data set. The formula is:

\[ P_i = L + \frac{\frac{iN}{100} - cf_p}{f_p} \times w \qquad \text{for } i = 1, 2, 3, \ldots, 99 \]
where, L = lower limit of the percentile class
cfp = cumulative frequency of the class preceding the percentile class
fp = frequency of the percentile class
w = width of the class.

To illustrate the computation of quartiles, deciles and percentiles, consider the following
data on sales of companies in lakhs.

Sales ( in lakhs): 0-10 10-20 20-30 30-40 40-50 50-60

Frequency: 12 18 27 20 17 6

Calculate Q1 ,Q3,D6 and P80

Solution:

Sales Frequency Cumulative Frequency

0-10 12 12

10-20 18 30

20-30 27 57

30-40 20 77

40-50 17 94

50-60 6 100

Q1 = (1 × N/4)th item = (1 × 100/4) = 25th item, which falls in the class 10-20, as the cumulative frequency of this class is 30. Substituting the relevant values in the formula:

\[ Q_1 = L + \frac{\frac{N}{4} - cf_p}{f_q} \times w = 10 + \frac{25 - 12}{18} \times 10 = 17.22 \]
This value of Q1 suggests that 25% of the companies' sales are Rs. 17.22 lakhs or less and 75% of the companies' sales figures are more than that.

Q3 = (3 × N/4)th item = (3 × 100/4) = 75th item, which falls in the class 30-40, as the cumulative frequency of this class is 77. Substituting the relevant values in the formula:

\[ Q_3 = L + \frac{\frac{3N}{4} - cf_p}{f_q} \times w = 30 + \frac{75 - 57}{20} \times 10 = 39 \]
This value of Q3 suggests that 75% of the companies' sales are Rs. 39 lakhs or less and only 25% of the companies' sales figures are more than that.

D6 = (6 × N/10)th item = (6 × 100/10) = 60th item, which falls in the class 30-40, as the cumulative frequency of this class is 77. Substituting the relevant values in the formula:

\[ D_6 = L + \frac{\frac{6N}{10} - cf_p}{f_d} \times w = 30 + \frac{60 - 57}{20} \times 10 = 31.5 \]
This value of D6 suggests that 60% of the companies' sales are Rs. 31.5 lakhs or less and only 40% of the companies' sales figures are more than that.

P80 = (80 × N/100)th item = (80 × 100/100) = 80th item, which falls in the class 40-50, as the cumulative frequency up to the preceding class (30-40) is 77. Substituting the relevant values in the formula:

\[ P_{80} = L + \frac{\frac{80N}{100} - cf_p}{f_p} \times w = 40 + \frac{80 - 77}{17} \times 10 = 41.76 \]
This value of P80 suggests that 80% of the companies' sales are Rs. 41.76 lakhs or less and only 20% of the companies' sales figures are more than that.
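All four positional values computed above follow the same interpolation pattern, which the following sketch (added for illustration) captures with a single helper function.

```python
# Positional measures (quartiles, deciles, percentiles) for grouped data
lower_limits = [0, 10, 20, 30, 40, 50]
frequencies  = [12, 18, 27, 20, 17, 6]
width = 10
N = sum(frequencies)                        # 100

def positional(position):
    """Interpolate the value at a given item position (e.g. N/4 for Q1)."""
    cum = 0
    for L, f in zip(lower_limits, frequencies):
        if cum + f >= position:             # this class contains the required item
            return L + (position - cum) / f * width
        cum += f

print(round(positional(1 * N / 4), 2))      # Q1  = 17.22
print(round(positional(3 * N / 4), 2))      # Q3  = 39.0
print(round(positional(6 * N / 10), 2))     # D6  = 31.5
print(round(positional(80 * N / 100), 2))   # P80 = 41.76
```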

Check your Progress 4

1. Fractiles that divide the data into 100 equal parts are called _____________


2. Second quartile is same as the Median( True/False)
3. Interpret the meaning of P30= 20 __________
2.7 Comparative Analysis between Mean, Median and Mode

Let's summarise the differences between the three major measures of central tendency:

• Definition: The mean of a data set is the sum of the data values divided by the number of observations; the median is the middle value of the data set arranged in ascending or descending order; the mode is the most frequently occurring value.
• Use of information: The mean is based on all the observations; the median and the mode are single values and do not use all the information available in the data.
• Uniqueness: The mean is uniquely defined; the median may not be unique; the mode is not uniquely defined for a multimodal distribution.
• Extreme values: The mean is affected by extreme values; the median and the mode are not affected by extreme values.
• Open-ended classes: The mean cannot be computed for open-ended classes; the median and the mode can be calculated from data with open-ended classes.
• Algebraic treatment: The mean can be treated algebraically (i.e. averages of different groups can be combined); the median and the mode cannot be treated algebraically.
• Qualitative data: The mean cannot be used for qualitative data; the median and the mode can be used even for qualitative data.

2.7.1 Relationship between Mean, Median and Mode

A distribution of data, in which the right half is mirror image of the left half is said to be
symmetrical. One example of symmetrical distribution is normal distribution or bell shaped
curve. In a symmetrical distribution, mean, median and mode all coincide at the same point. If
the distribution is skewed, the mean, median and mode are not equal. In a moderately skewed
distribution, the distance between mean and median is approximately one third of the distance
between the mean and the mode. This can be expressed as:

\[ \text{Mean} - \text{Median} = \frac{1}{3}\left(\text{Mean} - \text{Mode}\right) \]

\[ \text{Mode} = 3\,\text{Median} - 2\,\text{Mean} \]

Thus, if we know the values of any two measures of central tendency, the third can be approximately determined in any moderately skewed distribution. A skewed distribution can be of two types: (1) a negatively skewed distribution and (2) a positively skewed distribution. A negatively skewed distribution is skewed to the left with a long left tail, and a positively skewed distribution is skewed to the right with a long right tail. For such distributions the relationship between the mean, median and mode is as follows:

\[ \text{Mean} < \text{Median} < \text{Mode} \quad (\text{Negatively skewed distribution}) \]

\[ \text{Mode} < \text{Median} < \text{Mean} \quad (\text{Positively skewed distribution}) \]

Thus, in any skewed distribution, the median lies between the mean and the mode. When the population is skewed negatively or positively, the median is often the best measure, as it always lies between the mean and the mode. The median is not as highly influenced by the frequency of occurrence of a single value as the mode is, nor is it pulled by extreme values as the mean is.
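As a quick numerical illustration of the empirical relationship Mode = 3 Median − 2 Mean, consider the sketch below; the mean and median values used are hypothetical and chosen only for illustration.

```python
# Estimate the mode of a moderately skewed distribution from its mean and median
mean, median = 45.0, 42.0        # hypothetical values for illustration
mode = 3 * median - 2 * mean
print(mode)                      # 36.0 -> mode < median < mean, i.e. positively skewed
```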

Check your Progress 5


1. When the distribution is symmetrical and has one mode, the highest point on the
curve is
a) Mode
b) Median
c) Mean
d) (a), (b) and (c)
2. When the curve tails off to the left end, it is called
a) Symmetrical
b) Positively skewed
c) Negatively skewed
d) All of these
2.8 Let Us Sum Up
Measures of central tendency form a branch of descriptive statistics that helps to describe the characteristics of the data. The most common measures of central tendency are the mean, median and mode. In addition, quartiles, deciles and percentiles are positional measures. Any one of the measures may be used, based on the data and its application. These measures are computed differently for ungrouped and grouped data. The arithmetic mean is computed using all values and so can be influenced by extreme values. The median is unaffected by the magnitude of extreme values; this characteristic makes the median a very useful measure of location, especially for skewed distributions. The mode should be used when the most frequently occurring value needs to be found.

2.9 Answers for check your progress


Answers to check your Progress 1
1. (b)
2. (c)
3. AM = 38
Class Interval    Midpoint    fi    fiMi
20-30                25        8     200
30-40                35        7     245
40-50                45        1      45
50-60                55        0       0
60-70                65        3     195
70-80                75        1      75
                            ∑fi = 20   ∑fiMi = 760

\[ \mu = \frac{\sum f_i M_i}{\sum f_i} = \frac{760}{20} = 38 \]
4. Weighted arithmetic mean:

\[ \mu_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{2 \times 30000 + 5 \times 20000 + 10 \times 10000}{2 + 5 + 10} = Rs.\ 15294.12 \]

Answers to check your Progress 2

1. (b)
2.
Class     Frequency     Cumulative Frequency
0-1           1                 1
1-2           4                 5
2-3           8                13
3-4           6                19
4-5           3                22
5-6           1                23

N/2 = 23/2 = 11.5, which falls in the class 2-3, as the cumulative frequency of this class is 13. Substituting the relevant values in the formula:

\[ \text{Median} = L + \frac{\frac{N}{2} - cf_p}{f_{md}} \times w = 2 + \frac{11.5 - 5}{8} \times 1 = 2.8125 \]

This means that about half of the observations are 2.81 or less and the other half are 2.81 or more.

Answers to check your progress 3

1. The modal class is 2-3, as the highest frequency is 10. Substituting the values into the formula:

d1 = f1 − f0 = 10 − 6 = 4 and d2 = f1 − f2 = 10 − 9 = 1

\[ \text{Mode} = L + \left(\frac{d_1}{d_1 + d_2}\right) w = 2 + \left(\frac{4}{4 + 1}\right) \times 1 = 2.8 \]
Answers to check your Progress 4

1. Percentile
2. True
3. P30 = 20 means that 30% of the values are less than or equal to 20 and 70% are more than 20

Answers to check your Progress 5

1. (d)

2. (c)

2.10 Glossary
Arithmetic Mean: A measure of central tendency, computed by summing all the values and
dividing by the number of observations
Geometric Mean: A measure of central tendency used to measure the average rate of change or
growth for some quantity, computed by taking the nth root of the product of the values
representing change
Harmonic Mean: A measure of central tendency defined as the reciprocal of the arithmetic mean of the reciprocals of the individual observations.
Median: The middle value of the data set that divides the data into two halves
Mode: The value most often repeated in the data set
Quartile: Fractiles that divide the data into four equal parts
Decile: Fractiles that divide the data into ten equal parts
Percentile: Fractiles that divide the data into one hundred equal parts

2.11 Assignment

1. What do you mean by the property of central tendency?


2. What are the differences between mean, median and mode and their relative advantages
and disadvantages?
3. Explain symmetrical distribution. Discuss various types of skewed curves along with
relationship between mean, median and mode.
4. The following data represent the number of appointments made per hour in a hospital. Calculate the mean, median, mode, quartiles and 90th percentile for the data (Ken Black, 3.32).

Number of appointments: 0-1 1-2 2-3 3-4 4-5 5-6

Frequency: 31 57 26 14 6 3

2.12 Activities
Compare some small-cap, mid-cap and large-cap mutual funds on their 3-year and 5-year returns on the basis of measures of central tendency.

2.13 Case study


State Bank of India sells insurance policies under the company name SBI Life Insurance. The
approval process in life insurance consists of underwriting, which includes a review of
application, a medical information check, and possible requests for additional medical
information, medical checkups, and a policy compilation stage during which the policy pages are
generated and sent for approval. The ability to deliver approved policies to customers in a timely
manner is critical to the profitability of this service. During a period of one month, a random
sample of 27 approved policies is selected and the following processing time in days is recorded.
73 19 16 64 28 28 31 90 60 56 31 56
22 18 45 48 17 17 17 91 92 63 50 51
69 16 17
Using the concepts of measures of central tendency, compute the required statistics. What would
you tell a customer about how long the approval process takes?

2.14 Further reading


1. Applied Business Statistics, Ken Black, Wiley Publications
2. Business Statistics, David M. Levine et al., Pearson Education
3. Statistics for Management, Levin and Rubin, Pearson Education
4. Business Statistics, J.K. Sharma, Pearson Education
5. Business Statistics, Naval Bajpai, Pearson Education
Unit No. 3 Discrete Probability Distribution
_________________________________
Unit Structure

3.0 Learning Objectives

3.1 Introduction

3.2 Random Variable and Probability Distribution

3.3 Discrete Probability Distribution


3.3.1 Expected Value
3.3.2 Variance

3.4 Binomial Distribution

3.4.1 Using Binomial Distribution Table


3.4.2 Mean and Standard Deviation of Binomial Distribution

3.5 Poisson Distribution

3.5.1 Using Poisson Distribution Table

3.5.2 Mean and Standard Deviation of Poisson Distribution

3.6 Let Us Sum Up


3.7 Answers for Check Your Progress

3.8 Glossary
3.9 Assignment
3.10 Activities
3.11 Case Study

3.12 Further Reading


3.0 Learning Objectives

After learning this unit, you will be able to :

• Understand the importance of probability distributions in decision making

• Explain random variable and its types

• Identify the various situations where discrete probability distributions can be applied.

• Understand Binomial distribution and its uses

• Explain Poisson distribution and its uses

3.1 Introduction
Many times organizations are more interested in some function of the outcome of a process/
experiment than the actual outcome itself. For example road safety service may be interested to
know the probability of a particular number of accidents that could take place in a day rather
than the details of the accident itself. We recognize that this information on probability will be
very useful in taking decisions. Let's say a manufacturer randomly selects two boxes from a large batch of boxes to test their quality. Each selected box can be rated as good or defective. If the boxes are numbered 1 and 2, a defective box is designated as D and a good box as G. Then all the possible outcomes in the sample space are {D1G2, D1D2, G1G2, G1D2}. The expression D1G2 means the first box is defective and the second is of good quality. The possible outcomes are getting zero, one or two good boxes. It can be observed that the probability of getting exactly one good box (2/4) is greater than the probability of getting both good (1/4). This representation of possible outcomes and their probabilities is known as a probability distribution. Development of probability theory helps in
specifying probability distributions. There are a number of theoretical probability distributions
that have been analyzed. Many real life situations could be approximated to these distributions
and used for decision making. We will be studying some common probability distributions in
this and the subsequent unit. The objective of this unit is to study one type of probability
distribution- i.e. discrete probability distribution. The basic concept and its application in
decision making will be discussed

3.2 Random Variable and Probability Distribution


In previous unit we described frequency distribution as a useful way of summarizing the
variations in the observed data. Frequency distributions are prepared by listing the possible
outcomes of an experiment and indicating the observed frequency of each possible outcome. A
probability distribution is a theoretical frequency distribution, which is based on expected
outcomes. A frequency distribution is a listing of the observed frequencies of all the
outcomesof an experiment that actually occurs when the experiment was done, whereas a
probability distribution is a listing of the probabilities of all the possible outcomes that could
result if the experiment were done
Consider an example: An educational institute is predicting, what will be the composition of
the new MBA batch on the basis of their stream of graduation based on their experience of
previous batches. Assume the students are from these streams:
Stream B.Com BBA BE/BCA others
Probability 0.40 0.30 0.10 0.20
The above data are based on the institute's expectations about the new batch and are prepared before collecting any real data; this is called a probability distribution. However, once the admissions are done, the distribution of the actual data collected on the stream of graduation is called a frequency distribution.
An experiment is defined as any process that generates well-defined outcomes. Let's understand
the process of assigning numerical values to experimental outcomes. For any particular
experiment, a random variable can be defined in a way that each possible experimental
outcome generates exactly one numerical value for the random variable. For example if we
consider the experiment of cars arriving for repair work at an automobile service station, we
can describe the experimental outcomes in terms of numbers of cars arriving. In this case if x=
Number of cars arriving, x is called the random variable. The possible values that the random
variable ‘x’ can take are 0, 1, 2, 3, 4.,…..n cars.
A random variable is defined as a numerical description of the outcome of an experiment.
A random variable may be classified as either discrete or continuous, depending on the numerical values it can assume. In the above example, the random variable can take only discrete values. A random variable that may assume only a finite or countably infinite number of possible values (e.g. x = 0, 1, 2, 3, ..., n) is a discrete random variable. In most
situations, discrete random variables produce values that are nonnegative whole numbers. For
example, if 10 people are selected from a population and how many are female is to be
determined, the random variable here is discrete. The only possible numbers of female in the
sample are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10. There cannot be 3.5 females in a group of 10
people; obtaining decimal values is impossible. The number of units sold, the number of defective parts, the number of customers entering a bank and the number of voters who voted in an area are some examples of discrete random variables.
There are certain situations, in which the variable of interest can take infinitely many values.
Consider an example that a company is interested in ascertaining the probability distribution of
the volume of a 1000 ml bottle of soft drink manufactured by it. The company has reason to believe that the packaging process is such that at times the volume may be slightly less than or slightly more than 1000 ml. There are infinitely many values that the random variable 'volume' can take over a range. In such cases, it makes more sense to talk about the probability of the volume lying between two values, rather than the probability of the volume taking a specific value.
Random variables that may assume any value over a given interval are called continuous
random variable. It can be said that continuous random variables are generated from
experiments in which things are 'measured' rather than 'counted'. For example, the time a worker takes to assemble a product component can be any value within a reasonable range, such as 3.5 minutes (3 minutes, 30 seconds). This means that, unlike a discrete random variable, a continuous random variable can take decimal values also. Weight, time, temperature, the percentage of projects completed on time and the length of a car are some examples of continuous random variables.
The outcomes for random variables and their associated probabilities can be organized into
distributions. The distributions constructed from discrete random variables are called discrete
probability distributions and the distributions constructed from continuous random variables
are called continuous probability distributions.

Check Your Progress 1


State whether following statements are True or False
1.Variables which take on values only at certain points over a given interval are called
continuous random variables__________
2.A variable that can take on values at any point over a given interval is called a discrete
random variable_________
3.The number of automobiles sold by a dealership in a day is an example of a discrete
random variable________
4.The amount of time a patient waits in a doctor's office is an example of a continuous
random variable________

3.3 Discrete Probability Distribution

As an example, the following data give the distribution of the number of loans approved per week at the local branch office of a bank. The listing is collectively exhaustive, as all the possible outcomes are listed, and thus the probabilities must add up to 1.

x:       0     1     2     3     4      5     6
P(x):  0.1   0.1   0.2   0.3   0.15   0.1   0.05

The figure below is a graphical representation of the data, with the values of the random variable x shown on the horizontal axis and the probability that x takes on these values shown on the vertical axis.

[Figure: Bar chart of the probability P(x) against the number of loans per week, x = 0 to 6]
3.3.1 Expected Value

After constructing the probability distribution for a random variable, we often want to calculate
the mean of the random variable. The mean µ of a probability distribution is the expected value
of a random variable. To calculate the expected value, you multiply each possible outcome x by
its corresponding probability P(x) and then add the resulting terms. The mathematical formula
for computing the expected value of a discrete random variable is:

\[ \mu = E(x) = \sum x_i P(x_i) \]

where xi = the ith outcome of the discrete random variable x, and
P(xi) = the probability of occurrence of the ith outcome of x.

Let's find the expected value for the given probability distribution of loans approved per week using the formula.

No of loans per week( xi) P(xi) xi P(xi)


0 0.1 0.0
1 0.1 0.10
2 0.2 0.4
3 0.3 0.9
4 0.15 0.6
5 0.1 0.5
6 0.05 0.3
∑=1.00 µ=E(x)=2.8

\[ \mu = E(x) = \sum x_i P(x_i) = 2.8 \]

The expected value of 2.8 represents the mean number of loans approved per week. For
experiments that can be repeated numerous times, the expected value can be interpreted as the
'long-run' average value of the random variable. However, it does not mean that the random variable will assume this value the next time the experiment is conducted. In fact, it is
impossible to approve exactly 2.8 loans in any week. This value is important to a manager from
both the planning and decision making point of view. For example the company is interested to
know how many loans will be approved in the next five weeks. Although we cannot specify the exact number of loans approved in a week, based on the expected value of 2.8 loans per week we can say that the expected number of loans approved over the next five weeks will be 14 (2.8 × 5). In
terms of setting targets or allocating work, the expected value may provide helpful decision
making information.
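The expected-value computation for the loan data can be sketched as follows (an illustrative addition):

```python
# Expected value of the loans-per-week distribution
x = [0, 1, 2, 3, 4, 5, 6]
p = [0.10, 0.10, 0.20, 0.30, 0.15, 0.10, 0.05]

expected = sum(xi * pi for xi, pi in zip(x, p))
print(round(expected, 2))        # 2.8
```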

3.3.2 Variance and Standard Deviation of a Discrete Distribution


The expected value gives us an idea of the average or central value for the random variable, but
often we want to measure variability of the possible values of random variable. The variance is a
commonly used measure to summarize the variability in the values of the random variable. The
variance of the probability distribution can be computed by multiplying each possible squared
difference (xi-µ)2 by its corresponding probability and then summing the resulting values. The
mathematical expression for the variance of a discrete random variable is:

\[ \sigma^2 = \sum (x_i - \mu)^2 P(x_i) \]

The standard deviation can be computed using the formula:

\[ \sigma = \sqrt{\sigma^2} = \sqrt{\sum (x_i - \mu)^2 P(x_i)} \]

No of loans per week( xi) P(xi) xi P(xi) (xi-µ)2P(xi)


0 0.1 0.0 (0-2.8)2(0.10)=0.784
1 0.1 0.10 (1-2.8)2(0.10)=0.324
2 0.2 0.4 (2-2.8)2(0.20)=0.128
3 0.3 0.9 (3-2.8)2(0.30)=0.012
4 0.15 0.6 (4-2.8)2(0.15)=0.216
5 0.1 0.5 (5-2.8)2(0.10)=0.484
6 0.05 0.3 (6-2.8)2(0.05)=0.512
∑=1.00 µ=E(x)=2.8 σ2=2.46

\[ \sigma = \sqrt{\sigma^2} = \sqrt{\sum (x_i - \mu)^2 P(x_i)} = \sqrt{2.46} = 1.57 \]
The variance of the number of loans approved per week is 2.46. For the purpose of easier
managerial interpretation, the standard deviation may be preferred over the variance, as it is
measured in the same units as the random variable. The variance (σ2) is measured in squared
units and is thus more difficult for a manager to interpret. The utility of the variance and standard
deviation is limited to comparisons of variability of different random variables. For example, the
number of loans approved by two credit risk managers can be compared for variability.
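The variance and standard deviation of the same distribution can be computed in the same style (illustrative sketch):

```python
import math

# Variance and standard deviation of the loans-per-week distribution
x = [0, 1, 2, 3, 4, 5, 6]
p = [0.10, 0.10, 0.20, 0.30, 0.15, 0.10, 0.05]

mu  = sum(xi * pi for xi, pi in zip(x, p))                # 2.8
var = sum((xi - mu) ** 2 * pi for xi, pi in zip(x, p))    # 2.46
print(round(var, 2), round(math.sqrt(var), 2))            # 2.46 1.57
```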

There are many discrete probability distributions, but in this unit, we will be discussing two
types of discrete distribution- Binomial distribution and Poisson distribution.
Check Your Progress 2
1. The mean or the expected value of a discrete distribution is the long-run average of
the occurrences.( True/False)
2. To compute the variance of a discrete distribution, it is not necessary to know the
mean of the distribution.(True/ False)
3. You are offered an investment opportunity. Its outcomes and probabilities are presented in the following table.
X          P(x)
-$1,000    0.40
$0         0.20
+$1,000    0.40
The mean of this distribution is _____________.
a) -$400
b) $0
c) $200
d) $400
4. You are offered an investment opportunity. Its outcomes and probabilities are presented in the following table.
X          P(x)
-$1,000    0.40
$0         0.20
+$1,000    0.40
The standard deviation of this distribution is _____________.
a) -$400
b) $663
c) $800,000
d) $894

3.4 Binomial Distribution


The most widely used of all discrete distributions is the binomial distribution. The following assumptions underlie the use of the binomial distribution:
• The experiment consists of a sequence of n identical trials
• Each trial has only two possible outcomes denoted as success and failure
• Each trial is independent of previous trials
• Probabilities of the two outcomes remain constant throughout the experiment

As the word binomial suggests, any single trial of a binomial experiment contains only two
possible outcomes. The two outcomes are labeled success or failure. The outcome of interest to
the researcher is usually labeled as success. The symbol ‘p’ represents the probability of success
of a trial and the symbol ‘q’ is the probability of failure of a trial. Let ‘x’ denote the value of the
random variable, then x can have a value of 0, 1, 2, 3…..n, depending on the number of success
observed in n trials. The mathematical formula for computing the probability of any value for the
random variable where the binomial distribution is applicable is:

\[ P(x) = \,^{n}C_x \, p^x q^{n-x} = \frac{n!}{x!\,(n-x)!} \, p^x q^{n-x} \]

where n = number of trials
x = number of successes desired
p = probability of success in one trial
q = 1 − p = probability of failure in one trial
To illustrate the binomial probability distribution, let us consider the experiment of customers entering a toy store. To keep the problem relatively small, we restrict the experiment to the next five customers.
Based on experience, the store owner estimates that the probability of a customer making a purchase is 0.30. What is the probability that exactly three of the next five customers make a purchase?
Let's check the assumptions of a binomial experiment:

1. The experiment is described as a sequence of five identical trials, one trial for each of the five customers entering the store
2. Each trial has only two possible outcomes: the customer makes a purchase (success) or the customer does not make a purchase (failure)
3. The purchase decision of one customer is independent of the decisions of the other customers, so each trial is independent of the previous trials
4. The probabilities of purchase (p = 0.30) and no purchase (q = 0.70) remain constant throughout the experiment
The random variable x is defined as the number of customers making a purchase. With n = 5 trials, p = 0.30 and q = 0.70, the probability that exactly 3 of the five customers make a purchase can be computed using the formula:

\[ P(x=3) = \,^{5}C_3 \,(0.30)^3 (0.70)^{5-3} = \frac{5!}{3!\,(5-3)!} (0.30)^3 (0.70)^2 = 0.1323 \]
Similarly, we can find the probability of zero (x = 0) customers making a purchase:

\[ P(x=0) = \,^{5}C_0 \,(0.30)^0 (0.70)^{5-0} = \frac{5!}{0!\,(5-0)!} (0.30)^0 (0.70)^5 = 0.1681 \]
If we are interested in computing the probability of at most 3 customers making a purchase, we need to find the probabilities P(x=0), P(x=1), P(x=2) and P(x=3) and then sum them up. In the next section, we discuss the use of tables to obtain probability values directly.
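The binomial probabilities computed above (and the cumulative value just mentioned) can be reproduced with a short sketch; this is an illustrative addition using Python's standard library.

```python
from math import comb

# Binomial probability: P(x successes in n trials with success probability p)
def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.30
print(round(binom_pmf(3, n, p), 4))                         # 0.1323
print(round(binom_pmf(0, n, p), 4))                         # 0.1681
print(round(sum(binom_pmf(x, n, p) for x in range(4)), 4))  # P(x <= 3) = 0.9692
```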

3.4.1 Using Binomial Table

Binomial distributions are a family of distributions. Every different value of n and/or every
different value of p gives a different binomial distribution and tables are available for various
combinations of n and p values. Such a table for binomial probability values is provided in
Appendix Statistical Table A. In order to use this table, we need to specify values of n, p and x
for the binomial experiment. Each table is headed by a value of n. Eleven values of p are
presented in each table of size n. The column below each value of p is the binomial distribution
for that combination of n and p.
To illustrate the use of binomial tables, let's take an example. ABC Resources publishes data on market share for various product categories in FMCG. As per the latest report, Oreo controls 10% of the cookies market. Suppose 20 purchasers are selected randomly from the population. What is the probability that fewer than four purchasers choose Oreo?

For this problem n = 20, p = 0.10 and x < 4 (i.e. x = 0, 1, 2 or 3). The portion of the binomial tables under n = 20 can be used to find the probability values. Search along the p values for 0.10. Determining the probability of getting x < 4 involves adding the probabilities for x = 0, 1, 2 and 3. The required values appear in the column at the intersection of each x value and p = 0.10.

x value Probability
0 0.122
1 0.270
2 0.285
3 0.190
∑= 0.867

P(x < 4) = 0.867. If 10% of all cookie purchasers prefer Oreos and 20 cookie purchasers are randomly selected, then about 86.7% of the time fewer than four of the 20 will select Oreos.

3.4.2 Mean and Standard Deviation of Binomial Distribution


A binomial distribution has an expected value or a long run average, which is denoted by µ. The
expected value means that if n items are sampled over and over for a long time and if p is the
probability of getting success on one trial, the average number of success per sample is expected
to be np.
𝜇 = 𝑛𝑝

Let’s say, according to a study, 64% of all consumers believe that public sector banks are more
competitive than five years ago. If 25 consumers are selected randomly, what is the expected
number who believe that public sector banks are more competitive than they were five years
ago?
This problem can be described by the binomial distribution with n = 25 and p = 0.64. The mean for this problem can be computed as:
𝜇 = 𝑛. 𝑝 = 25 × 0.64 = 16

It means that in the long run, if 25 consumers are selected randomly again and again and if 64% of consumers believe the given statement, then on average 16 out of 25 will believe that public sector banks are more competitive than five years ago.

The standard deviation of the binomial distribution is denoted by σ and is computed using the following formula:

\[ \sigma = \sqrt{npq} \]

For the given data, the standard deviation is

\[ \sigma = \sqrt{25 \times 0.64 \times 0.36} = 2.4 \]
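A two-line sketch (added for illustration) confirms these values:

```python
import math

# Mean and standard deviation of a binomial distribution
n, p = 25, 0.64
mu    = n * p                           # 16.0
sigma = math.sqrt(n * p * (1 - p))      # 2.4
print(mu, sigma)
```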
Check Your Progress 3
1. The distribution that deals only in success and failures is referred to as the ________
2. If x is a binomial random variable with n=8 and p=0.6, the mean value of x is _____
a) 6
b) 4.8
c) 3.2
d) 8
3. If x is a binomial random variable with n=8 and p=0.6, the standard deviation of x is
a) 4.8
b) 3.2
c) 1.92
d) 1.39
4. If x is a binomial random variable with n=8 and p=0.6, what is the probability that x
is equal to 4?
a) 0.500
b) 0.005
c) 0.124
d) 0.232

3.5 Poisson Distribution
The binomial distribution describes a distribution with two possible outcomes for a given number of trials. The Poisson distribution focuses only on the number of discrete occurrences over some interval. Unlike the binomial distribution, a Poisson experiment does not have a given number of trials (n). The Poisson distribution has the following characteristics:
• It describes discrete occurrences over a continuum
• Each occurrence is independent of the other occurrences
• The occurrence in each interval can range from zero to infinity
• The expected number of occurrences must remain constant throughout the experiment

This distribution was used initially to describe the occurrence of rare events over some interval. Some common examples where a Poisson random variable can be used are the number of accidents per day, the number of earthquakes occurring over a time period, the number of misprints on a page, the number of interruptions per minute on a server, the number of arrivals at a tollbooth, etc.
If a Poisson distributed phenomenon is studied for a long period of time, a long-run average can be determined. This average is denoted by lambda (λ) and is used to describe the Poisson distribution. The Poisson formula used to compute the probability of x occurrences over an interval for a given lambda value is:

\[ P(x) = \frac{\lambda^x e^{-\lambda}}{x!} \]
where x = 0, 1, 2, 3, ...
λ = long-run average
e = 2.71828 (the base of natural logarithms)
Here x is the number of occurrences per interval for which the probability is to be computed. The λ value must remain constant throughout the Poisson experiment.

Suppose that we are interested in the number of arrivals at a bank window during a 10-minute period on weekday mornings. We assume that the arrival of one customer is independent of the arrival of another. Based on historical data it is found that the average number of customers arriving during a 10-minute interval is 8. If we want to find the probability of
arrival of five customers in 10 minutes, we would use x = 5 and λ = 8 per 10 minutes and compute:

\[ P(x=5) = \frac{\lambda^x e^{-\lambda}}{x!} = \frac{8^5 \times e^{-8}}{5!} = 0.0916 \]
Suppose we want to find the probability of 9 customers arriving in twenty minutes. Note that there is a change in the interval: instead of 10 minutes, the probability is to be found for 20 minutes. As per the λ value, on average 8 customers arrive in 10 minutes, so we can derive the new average rate for 20 minutes by multiplying λ by 2, i.e. 16 customers per 20 minutes. To compute the probability we would use x = 9 and λ = 16 per 20 minutes:

\[ P(x=9) = \frac{\lambda^x e^{-\lambda}}{x!} = \frac{16^9 \times e^{-16}}{9!} = 0.0213 \]
The probability of 9 customers arriving in a twenty-minute interval is 0.0213. Similarly, if we want to find the probability of an x value for a 5-minute interval, the lambda value will be 4 customers per five minutes. If we want to find cumulative probabilities, such as fewer than 8 customers, we need to find the individual probabilities (for x = 0, 1, 2, 3, 4, 5, 6, 7) and then add them up. However, in such cases it is easier to use Poisson tables.
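Both Poisson probabilities computed above can be verified with the following illustrative sketch:

```python
import math

# Poisson probability: P(x occurrences in an interval with mean rate lam)
def poisson_pmf(x, lam):
    return lam ** x * math.exp(-lam) / math.factorial(x)

print(round(poisson_pmf(5, 8), 4))     # 0.0916 (5 arrivals, lambda = 8 per 10 minutes)
print(round(poisson_pmf(9, 16), 4))    # 0.0213 (9 arrivals, lambda = 16 per 20 minutes)
```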

3.5.1 Using Poisson Tables

Every value of lambda determines a different Poisson distribution. Regardless of the nature of the interval associated with lambda, the Poisson distribution for a particular lambda is the same. Appendix Statistical Table B contains the Poisson distribution for selected values of lambda. Probabilities for each x value associated with a given lambda are displayed, provided the probability value is nonzero.
Let's illustrate the use of the Poisson table with the following problem. The number of faults per month that arise in the gearboxes of travel buses is known to follow a Poisson distribution with a mean of 2.5 faults per month. What is the probability that in a given month fewer than 3 faults are found?

For this problem λ = 2.5 faults per month and x = 0, 1, 2. The portion of the Poisson tables under λ = 2.5 can be used to find the probability values. The values appear in the column at the intersection of each x value and λ = 2.5.

x value Probability
0 0.0821
1 0.2052
2 0.2565
∑= 0.5438
The probability that in a given month fewer than three faults are found is 0.5438.
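The cumulative value read from the table can also be computed directly, as in this illustrative sketch:

```python
import math

# P(fewer than 3 faults) for a Poisson process with lambda = 2.5 per month
lam = 2.5
prob = sum(lam ** x * math.exp(-lam) / math.factorial(x) for x in range(3))
print(round(prob, 4))    # 0.5438
```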
3.5.2 Mean and Standard Deviation of Poisson Distribution
The mean or expected value of a Poisson distribution is λ. It is the long-run average of occurrences per interval if many samples are taken over time. Lambda is usually not a whole number, so most of the time it is impossible to actually observe exactly lambda occurrences in an interval. For example, suppose lambda is 4.5 per interval for a Poisson distribution, and a random sample of 20 intervals resulted in the following x occurrences per interval:
3, 4, 7, 6, 5, 4, 3, 4, 5, 6, 4, 5, 3, 4, 5, 6, 7, 5, 3, 4
Computing the average of this sample gives 4.65; however, over infinite sampling the long-run average λ is 4.5 per interval. Note that when λ is 4.5, most of the observed values lie near 4 and 5, and values such as 1, 2, 10 or 11 occur only rarely. Thus knowing the mean of the Poisson distribution gives a feel for the actual occurrences that are likely to happen.

The variance of a Poisson distribution is also λ. The standard deviation is √λ.

Check Your Progress 4


1. The mean number of occurrences per interval of a Poisson distribution is denoted
by___
2. For a Poisson distribution the standard deviation is calculated as_________
3. The number of cars arriving at a toll booth in five-minute intervals is Poisson
distributed with a mean of 3 cars arriving in five-minute time intervals. The
probability of 3 cars arriving over a five-minute interval is _______.
a) 0.2700
b) 0.0498
c) 0.2240
d) 0.0001
4. A service station has a pump that distributes petrol to automobiles. It is estimated that 7 cars use the petrol pump every 2 hours. Assuming the arrivals are Poisson distributed, what is the probability that at least three cars will arrive to use the petrol pump during a one-hour period?

3.6 Let Us Sum Up
Probability experiments produce random occurrences. A variable that contains the outcomes of a random experiment is called a random variable. A random variable that may assume only a finite or countably infinite number of possible values is a discrete random variable. Random variables that may assume any value over a given interval are called continuous random variables. Discrete distributions are constructed from discrete random variables, and continuous distributions are constructed from continuous random variables. We have looked into situations which give rise to discrete distributions and how they can be helpful in decision making. We have discussed two types of discrete distributions: the binomial and the Poisson distribution. The concepts of expected value and standard deviation were discussed along with their interpretation. The binomial distribution fits experiments in which only two outcomes are possible. The Poisson distribution pertains to occurrences over some interval. The assumptions are that each occurrence is independent of other occurrences and that the value of lambda remains constant over the period of time.
3.7 Answers for Check Your Progress

Answers to check your progress 1


1. False
2. False
3. True
4. True

Answers to check your progress 2


1. True
2. False
3. (b)
4. (d)

Answers to check your progress 3

1. Binomial distribution
2. (b)
3. (d)
4. (d)

Answers to check your progress 4

1. λ
2. σ = √λ
3. (c)
4. P(x ≥ 3), λ = 7 cars per two hours, so for a one-hour interval λ = 3.5 cars.
P(x ≥ 3) = 1 − {P(x=0) + P(x=1) + P(x=2)}
x value Probability
0 0.0302
1 0.1057
2 0.1850
∑= 0.3209
P(x ≥ 3) = 1 − (0.0302 + 0.1057 + 0.1850) = 1 − 0.3209 = 0.6791

3.8 Glossary
Probability distribution: A list of the outcomes of an experiment together with the probabilities associated with those outcomes.
Random variable: A variable that takes on different values as a result of the outcomes of a random experiment.
Discrete random variable: A random variable that is allowed to take only a finite or countably infinite number of values.
Continuous random variable: A random variable that is allowed to take on any value within a given range.
Discrete probability distribution: The probability distribution of a discrete random variable.
Continuous probability distribution: The probability distribution of a continuous random variable.
Expected value: A weighted average of the outcomes of an experiment.
Binomial distribution: A discrete probability distribution used to compute the probability of x successes in n trials.
Poisson distribution: A discrete probability distribution used to compute the probability of x occurrences over a specified interval.

3.9 Assignment
1. What is meaning of expected value of a probability distribution?
2. What are the assumptions of a Binomial distribution?
3. What are the characteristics of a Poisson distribution?
4. A survey conducted for an insurance company revealed that 70% of workers say job stress caused frequent health problems. Suppose a random sample of 10 workers is selected. What is
the probability that more than seven of them say job stress caused frequent health problems?
What is the expected number of workers who say job stress caused frequent health
problems?
5. A survey conducted by the Consumer Research Centre reported, among other things, that women spend an average of 1.2 hours per week shopping online. Assume that hours per week spent shopping online are Poisson distributed. If the survey result is true for all women and if a woman is randomly selected, what is the probability that she did not shop online at all over a one-week period? What is the probability that a woman would shop three or more hours online during a one-week period?

3.10 Activities
Develop graphs for the binomial distribution using the tables for n = 8 and (a) p = 0.20, (b) p = 0.50 and (c) p = 0.80, and comment on the shapes of the three graphs.

3.11 Case Study

Starting a business entails understanding and dealing with many issues—legal, financing, sales
and marketing, intellectual property protection, liability protection, human resources, and
more. The interest in entrepreneurship is at an all-time high. And there have been spectacular
success stories of early stage startups growing to be multi-billion-dollar companies, such as
Uber, Facebook, WhatsApp, Airbnb, and many others. Starting a business is a huge commitment. Entrepreneurs often fail to appreciate the significant amount of time, resources, and energy needed to start and grow a business.

A survey was done to identify the most important advice for starting a business venture. A random sample of 12 small business owners was contacted and data were collected. As per the survey, 20% of all small business owners say the most important advice for starting a business is
to prepare for long hours and hard work. Twenty five percent say the most important advice is to
have good financing ready. Nineteen percent say having a good plan is the most important
advice, 18 % say studying the industry and industry knowledge is the most important advice and
18% list other advice.
Questions
1. What is the probability that six or more owners would say preparing for long hours and
hard work is the most important advice?
2. What is the probability that exactly five owners would say having good financing ready is the most important advice?
3. What is the expected number of owners who would say having a good plan is the most
important advice?

3.12 Further Reading

1. Applied Business Statistics, Ken Black, Wiley Publications
2. Business Statistics, David M. Levine et al., Pearson Education
3. Statistics for Management, Levin and Rubin, Pearson Education
4. Business Statistics, J.K. Sharma, Pearson Education
5. Business Statistics, Naval Bajpai, Pearson Education
Unit No. 4 Continuous Probability Distribution
_________________________________
Unit Structure

4.0 Learning Objectives


4.1 Introduction
4.2 Continuous Probability Distribution
4.3 Uniform Distribution

4.3.1 Area as a Measure of probability

4.4 Normal Distribution

4.4.1 Probability density Function and its characteristics

4.4.2 Standard Normal Probability Distribution Table

4.4.3 Solving Normal Distribution Problems

4.4.4 Normal as an approximation of Binomial

4.5 Exponential Distribution

4.5.1 Probabilities of Exponential Distribution

4.6 Let us Sum up

4.7 Answers for Check your Progress


4.8 Glossary

4.9 Assignment

4.10 Activities

4.11 Case Study

4.12 Further Readings


4.0 Learning Objectives
After learning this unit, you will be able to:

• Understand the importance of continuous probability distributions in decision making

• Identify the situations where continuous probability distributions can be applied

• Explain Uniform distribution and its application

• Understand Normal distribution and its application

• Explain Exponential distribution and its application

4.1 Introduction
In the last unit, we discussed situations involving discrete random variable and the resulting
discrete probability distributions. In this unit we will be focusing on random variable which can
take any value over a range. Suppose you are a website designer for a matrimonial site and you
have to make sure that the webpage downloads quickly. The download time is affected by design
of the website and the load on the company’s web server. The random variable ‘download time’
is a continuous variable, as it can take any value on a range and not just whole number. This type
of random variable which can take infinite number of values over a range is called a continuous
random variable and the probability distribution of such variable is called continuous probability
distribution. The concepts and assumptions for this type of distributions is quite different from
those of discrete probability distributions. The objective of this unit is to study the concepts and
usefulness of continuous distribution. We will be discussing some important continuous
probability distributions and their applications in this unit.

4.2 Continuous Probability Distributions


Continuous distributions are constructed from continuous random variables, which can take values at every point over a given interval and are usually generated from experiments in which things are 'measured' as opposed to 'counted' as in discrete distributions. With continuous distributions, probabilities of outcomes occurring between particular points are determined by calculating the area under the curve between those points. In addition, the entire area under the whole curve is equal to 1. Various continuous distributions include the uniform distribution, the normal distribution, the exponential distribution, the t distribution, the chi-square distribution and the F distribution. In this unit we will discuss the uniform distribution, the normal distribution and the exponential distribution.

Figure 1 graphically represents three continuous distributions. Figure 1(a) depicts a uniform distribution, where a value is equally likely to occur anywhere in the range between the smallest value 'a' and the largest value 'b'. Sometimes referred to as the rectangular distribution, the uniform distribution is symmetric, meaning its mean equals its median.
Figure 1 (a) Uniform Distribution (b) Normal Distribution (c) Exponential Distribution

Figure 1(b) depicts a normal distribution. The normal distribution is symmetrical and bell shaped, so most of the values group around the mean. The mean, median and mode all have the same value. An exponential distribution is illustrated in Figure 1(c). An exponential distribution is a positively skewed distribution, which makes the mean larger than the median. The range of an exponential distribution is zero to positive infinity, but its shape makes it highly unlikely for extremely large values to occur.

Check your Progress 1


1. The probability of occurrences remain constant in uniform distribution ( True/ False)
2. The exponential distribution is a
a. Positively skewed curve
b. Normal Curve
c. Negatively skewed curve
d. Symmetric curve

4.3 Uniform Distribution

Uniform distribution refers to a probability distribution in which all of the values that a random
variable can take on occur with equal probability over the range between the smallest value ‘a’ and
the largest value ‘b’. Suppose the travel time of buses travelling from city X to city Y is denoted
by x. Assume that the minimum time is 3 hours and the maximum time is 3 hours 20 minutes.
Thus, in terms of minutes, the travel time can take any value in the interval between 180 and 200
minutes. As the random variable x can take any value between 180 and 200 minutes, x is a
continuous variable. Based on past data, the probability of a travel time between 180 and 181
minutes is the same as the probability of a travel time in any other 1-minute interval up to and
including 200 minutes. With every interval being equally likely, the random variable x has a
uniform distribution. The following probability density function defines a uniform distribution:

f(x) = 1/(b − a)   for a ≤ x ≤ b
f(x) = 0           for all other values
In a uniform distribution, the total area under the curve is 1 and as the shape is rectangular the
area can be computed as the product of length and width of the rectangle. Because, by definition,
the distribution lies between the x values of a and b, the length of the rectangle is (b-a).
Combining this with the fact that area under the curve is equal to 1, height of the rectangle can be
solved as follows:
Area of rectangle = Length × Height = 1, where Length = (b − a)

Therefore (b − a) × Height = 1, so

Height = 1/(b − a)

The mean and the standard deviation of the uniform distribution are given as follows:

µ = (a + b)/2

σ = (b − a)/√12
As an example, suppose a production line manufactures a machine part in lots of 10 per minute
during a shift. When the lots are weighed, variation in weights is observed in the range of 34 to
48 grams, in a uniform distribution. The height of the distribution is:

Height = 1/(b − a) = 1/(48 − 34) = 1/14

The mean and the standard deviation of the uniform distribution are given as follows:

µ = (a + b)/2 = (48 + 34)/2 = 82/2 = 41

σ = (b − a)/√12 = (48 − 34)/√12 = 14/3.464 = 4.041

4.3.1 Area as a measure of probability


As discussed earlier, for a continuous distribution, probabilities are calculated by determining the
area under the function over an interval. With a continuous distribution, there is no area under
the curve at a single point. The following formula is used to determine the probability of a value
between x1 and x2 for a uniform distribution:

P(x1 ≤ x ≤ x2) = (x2 − x1)/(b − a)

where a ≤ x1 ≤ x2 ≤ b

The probability of x > b or x < a is zero, because there is no area above b or below a.

Suppose, for the same problem given above, we are interested in the probability that a lot weighs
between 40 and 45 grams. The probability can be calculated as:

P(40 ≤ x ≤ 45) = (x2 − x1)/(b − a) = (45 − 40)/(48 − 34) = 0.3571

So the probability that the lot weighs between 40 and 45 grams is 0.3571. The probability that the
lot weight is less than 34 grams is zero, as the lowest value is 34. Similarly, the probability that
the lot weight is more than 50 grams is also zero, as the upper value is 48.

Let us find the probability that the lot weighs less than 40 grams. As the lowest value is 34, the
probability that the lot weighs less than 40 grams actually means the probability of values
between 34 and 40 grams. So the probability is calculated as follows:

P(34 ≤ x ≤ 40) = (x2 − x1)/(b − a) = (40 − 34)/(48 − 34) = 0.4286
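The uniform-distribution calculations above can be checked with a short Python sketch (illustrative only; the variable and function names are ours, not part of the text):

# Illustrative sketch of the lot-weight example: uniform distribution on [34, 48] grams.
a, b = 34.0, 48.0

height = 1 / (b - a)               # height of the rectangular density = 1/14
mean = (a + b) / 2                 # mu = (a + b)/2 = 41
std_dev = (b - a) / 12 ** 0.5      # sigma = (b - a)/sqrt(12) = 4.041

def uniform_prob(x1, x2):
    """P(x1 <= x <= x2) for the uniform distribution on [a, b]."""
    lo, hi = max(x1, a), min(x2, b)    # area outside [a, b] is zero
    return max(hi - lo, 0.0) / (b - a)

print(round(height, 4), mean, round(std_dev, 3))   # 0.0714 41.0 4.041
print(round(uniform_prob(40, 45), 4))              # 0.3571
print(round(uniform_prob(34, 40), 4))              # 0.4286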

Check your progress 2


1. A uniform continuous distribution is also referred to as a rectangular distribution.
(True/False)
2. If x is uniformly distributed over the interval 8 to 12, inclusively (8 ≤ x ≤ 12), then
the mean of this distribution is __________________.
a) 10
b) 20
c) 5
d) 0
3. If x is uniformly distributed over the interval 8 to 12, inclusively (8 ≤ x ≤ 12), then
the standard deviation of this distribution is __________________.
a) 4.00
b) 1.33
c) 1.15
d) 2.00
4. If x is uniformly distributed over the interval 8 to 12, inclusively (8 ≤ x ≤ 12), then
the probability, P(13 ≤ x ≤ 15), is __________________.
a) 0.250
b) 0.500
c) 0.375
d) 0.000
5. If x is uniformly distributed over the interval 8 to 12, inclusively (8 ≤ x ≤ 12), then
the probability, P(9 ≤ x ≤ 11), is __________________.
a) 0.250
b) 0.500
c) 0.333
d) 0.750

4.4 Normal Distribution

A very important continuous probability distribution is the normal distribution. There are many
reasons for normal distribution’s versatility and prominent place in statistics. First, it has
properties that make it applicable to many situations in which it is necessary to make inferences
by taking samples. Quite often, we face the problem of limited data for making inferences about
processes. Irrespective of the shape of the distribution of population, it has been found that
normal distribution can be used to characterize sampling distributions. This helps considerably in
inferential statistics. Second, the normal distribution is similar to actual frequency distribution of
many phenomena, like human characteristics (weight, height, IQ), outputs from physical
processes (dimensions and yield) and other measures of interest to managers. This knowledge
helps us to calculate probabilities of different events in varied situations and which in turn help
us in decision making. Finally, the normal distribution can be used to approximate certain
probability distributions, which helps considerably in simplifying probability calculations.

4.4.1 Probability density function and its characteristics

The normal distribution has the following characteristics:

• It is a symmetrical distribution about its mean
• The two tails of the normal distribution extend indefinitely and never touch the
horizontal axis
• The curve has a single peak, i.e. it is unimodal
• The median and mode also lie at the centre; thus, for a normal curve, the mean, median
and mode all have the same value
• It is a family of curves

The normal distribution is described by two parameters: the mean µ and the standard deviation σ.
The density function of the normal distribution is:

f(x) = (1/(σ√(2π))) e^(−(1/2)((x − µ)/σ)²)

where µ = mean of x
σ = standard deviation of x
π = 3.14159
e = 2.71828

Using calculus to determine areas under the normal curve from this function is difficult and time
consuming; therefore, table values are generally used to analyse normal distribution problems.

4.4.2 Standard Normal Probability Distribution Table

Every unique pair of µ and σ values defines a different normal distribution. This characteristic of
being a family of curves could make analysis tedious, because a separate normal curve table
would be required for each combination of µ and σ. A mechanism was therefore developed by
which all normal distributions can be converted into a single distribution, the z distribution. This
process yields the standardized normal distribution. The conversion formula for any value x of a
given normal distribution is:

z = (x − µ)/σ, where σ ≠ 0

A z score is the number of standard deviations that a value, x, is above or below the mean. If the
value of x is less than the mean, the z score is negative; if the value of x is more than the mean,
the z score is positive; and if the value of x is equal to the mean, the z score is zero. This formula
converts the distance from the mean into standard deviation units. A standard z distribution table
can be used to find probabilities for any normal curve value that has been converted to a z score.
The z distribution is a normal distribution with a mean of 0 and a standard deviation of 1. Any
value of x at the mean is zero standard deviations from the mean. Any value of x that is one
standard deviation above or below the mean has a z value of +1 or −1. As per the empirical rule,
in a normal distribution, regardless of the values of µ and σ, about 68% of all values are within
one standard deviation of the mean, about 95% of all values are within two standard deviations
of the mean, and about 99.7% of all values are within three standard deviations of the mean. The
z distribution probability values are given in Appendix Statistical Table C. Table C gives the total
area between 0 and any point on the positive z axis. Since the curve is symmetric, the area
between z and 0 is the same irrespective of whether z is positive or negative. The table areas or
probabilities are always positive.

To use the z table to find probabilities, first note that values of z appear in the left-hand column,
with the second decimal value of z appearing in the top row. For example, for a value of 1.00, we
find 1.0 in the left-hand column and 0.00 in the top row. Then, looking into the body of the table,
we find that 0.3413 corresponds to the z value of 1.00. The value 0.3413 is the area under the
curve between the mean (z = 0) and z = 1.00, as shown graphically in Figure 2.
Figure 2: Area (probability) of 0.3413 between z = 0 and z = +1

Suppose we want to find the probability of obtaining a z value between z = −1.00 and z = +1.00.
We already know that the probability of a z value between z = 0.00 and z = 1.00 is 0.3413. The
normal distribution is symmetrical, i.e. the shape of the curve on the left of the mean is a mirror
image of the shape of the curve on the right of the mean. Thus the probability of a z value
between z = 0.00 and z = −1.00 is the same as the probability of a z value between z = 0.00 and
z = +1.00, i.e. 0.3413. Hence the probability between z = −1.00 and z = +1.00 is
0.3413 + 0.3413 = 0.6826, as shown graphically in Figure 3.
Figure 3: Area (probability) of 0.6826 between z = −1 and z = +1
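As a quick cross-check of these table values, a minimal Python sketch using the standard library's statistics.NormalDist (our own illustration, not part of the text) gives the same areas:

# Illustrative sketch: reproducing the z-table areas quoted above.
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)                  # the standard normal (z) distribution
area_0_to_1 = z.cdf(1.0) - z.cdf(0.0)          # area between z = 0 and z = +1
area_m1_to_1 = z.cdf(1.0) - z.cdf(-1.0)        # area between z = -1 and z = +1

print(round(area_0_to_1, 4))    # 0.3413
print(round(area_m1_to_1, 4))   # 0.6827 (the table value 0.6826 is a rounding of this)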
4.4.3 Solving Normal distribution problems

Suppose that the Ceat tyre company has just developed a new radial tyre that will be sold through
a national chain of stores. Because the tyre is a new product, the management believes that the
mileage guarantee offered with the tyre will be an important factor in consumer acceptance of the
product. Before finalizing the tyre’s mileage guarantee policy, Ceat management wants some
probability information concerning the number of miles the tyres will last.

From actual road tests with the tyres, the engineering department estimates the mean tyre mileage
to be 36500 miles and the standard deviation to be 5000 miles. In addition, the data collected
indicate that a normal distribution is a reasonable assumption. What percentage of the tyres can
be expected to last more than 40000 miles?

To compute the probability, we first find the z score:

z = (x − µ)/σ = (40000 − 36500)/5000 = 0.70

Figure: Normal curve with µ = 36500 and σ = 5000, showing the area where x exceeds 40000 miles.
Thus the probability that the normal distribution for tyre mileage has x values greater than
40000 is the same as the probability that the z distribution has a z value greater than 0.70. Using
the z table, we find that the area corresponding to z = 0.70 is 0.2580. We need to remember that
the table provides the area between the mean and the z value; thus we know that there is an area
of 0.2580 between the mean and z = 0.70. The total area under the curve is 1 and, the curve being
symmetrical, the area from the mean to either tail is 0.5. Thus the area above z = 0.70 is
0.5 − 0.2580 = 0.2420. In terms of tyre mileage x, we can conclude that there is a 0.2420
probability that the x value will be above 40000. Thus about 24.2% of the tyres manufactured can
be expected to last more than 40000 miles.
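The same calculation can be sketched in Python (our illustration; the parameters are taken from the example above):

# Illustrative sketch of P(x > 40000) for tyre mileage ~ Normal(36500, 5000).
from statistics import NormalDist

mileage = NormalDist(mu=36500, sigma=5000)
z = (40000 - 36500) / 5000                 # z score = 0.70
p_above = 1 - mileage.cdf(40000)           # P(x > 40000)
print(z, round(p_above, 4))                # 0.7 0.242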

Let us now assume that the company is considering providing a discount on a new set of tyres if
the mileage on the original tyres does not exceed the mileage stated in the guarantee. What
should the guarantee mileage be, if Ceat wants no more than 8% of the tyres to be eligible for the
discount?

Let us first interpret the problem graphically.

Figure: Normal curve with µ = 36500 and σ = 5000; the unknown guarantee mileage x cuts off the
lowest 8% of the area.

Note that 8% of the area lies below the unknown guarantee mileage that we need to calculate. It
means the area between the mean and the unknown guarantee value is 0.5 − 0.08 = 0.42. The
question is: how many standard deviations (what z value) do we have to be below the mean to
cut off 42% of the area? Earlier we used the z table to find an area from a z value; now we have
the area between the mean and the z value, and need to find the corresponding z value. If we look
for 0.42 in the body of the z table, we see that an area of 0.4200 occurs at approximately
z = 1.41. As the area is below the mean, the z value of interest must be −1.41. Hence the desired
guarantee mileage should be 1.41 standard deviations less than the mean. Putting the known
values into the formula,

z = (x − µ)/σ

−1.41 = (x − 36500)/5000

So x = 36500 − 1.41(5000) = 29450

Therefore a guarantee of 29450 miles will meet the requirement that approximately 8% of the
tyres will be eligible for the discount. With this information the firm might confidently decide to
set its guarantee mileage at 29000 miles. Again we see the important role of probability
distributions in providing information for decision making.
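The inverse calculation can also be sketched with the standard library's inverse normal CDF (our illustration; inv_cdf works directly with the cumulative area of 0.08 rather than a table lookup):

# Illustrative sketch: find the mileage below which 8% of tyres fall.
from statistics import NormalDist

mileage = NormalDist(mu=36500, sigma=5000)
guarantee = mileage.inv_cdf(0.08)          # x value with 8% of the area below it
print(round(guarantee))                    # about 29475 (29450 above, using the table value z = -1.41)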
4.4.4 Normal as an approximation of Binomial

As the sample size becomes large, the binomial distribution approaches the normal distribution,
regardless of the value of p. This happens faster (for smaller values of n) when p is near 0.50. To
work a binomial problem with the normal curve requires a transformation process. The first part
is to convert the two parameters of the binomial distribution, n and p, into the two parameters of
the normal distribution, µ and σ. This involves the following formulas:

µ = n·p and σ = √(n·p·q)

Suppose we want to find the probability that the random variable x lies between 20 and 24, when
a sample of 60 is taken and the probability of success is 0.30. From the previous unit we know
that this can be calculated using the formula:

P(x) = nCx p^x q^(n−x) = [n!/(x!(n − x)!)] p^x q^(n−x)

We would need to calculate P(x) for x = 20, 21, 22, 23 and 24 and then sum the results, which is
very tedious. Translating the binomial problem into a normal curve problem gives:

µ = n·p = 60(0.30) = 18 and σ = √(n·p·q) = √(60 × 0.30 × 0.70) = 3.55

As the binomial is a discrete distribution and the normal is a continuous distribution, we need to
use a correction for continuity for a better approximation. A correction of +0.50, −0.50 or ±0.50
is applied, depending on the problem. A rule of thumb for the correction for continuity is given in
the table below:

Values being determined     Correction
x >                         +0.50
x ≥                         −0.50
x <                         −0.50
x ≤                         +0.50
≤ x ≤                       −0.50 and +0.50
< x <                       +0.50 and −0.50
x =                         −0.50 and +0.50
As we are interested in probabilities between 20 and 24 (both inclusive), after applying the
correction for continuity we will find the area between 19.50 and 24.50.

Figure: Normal curve with µ = 18 and σ = 3.55, showing the area between 19.50 and 24.50.
For x = 19.50:

z = (x − µ)/σ = (19.50 − 18)/3.55 = 0.42

For x = 24.50:

z = (x − µ)/σ = (24.50 − 18)/3.55 = 1.83

From the z table we find that the area for z = 0.42 is 0.1628. This value is the area between the
mean and the z value. Similarly, for z = 1.83 the area is 0.4664. To find the required probability,
we subtract the two values:

P(19.50 ≤ x ≤ 24.50) = 0.4664 − 0.1628 = 0.3036

Thus the probability that the value will fall between 19.50 and 24.50 is approximately 0.30. You
may check this value by using the binomial distribution formula; the answer will be
approximately the same.
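A short Python sketch (our illustration) confirms that the exact binomial sum and the continuity-corrected normal approximation are both close to 0.30:

# Illustrative sketch: exact binomial vs. normal approximation for P(20 <= x <= 24), n = 60, p = 0.30.
from math import comb, sqrt
from statistics import NormalDist

n, p = 60, 0.30
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(20, 25))

mu, sigma = n * p, sqrt(n * p * (1 - p))      # mu = 18, sigma = 3.55
normal = NormalDist(mu, sigma)
approx = normal.cdf(24.5) - normal.cdf(19.5)  # continuity-corrected interval

print(round(exact, 4), round(approx, 4))      # both values are close to 0.30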

Check your progress 3


1. A z-score is the number of standard deviations that a value of a random variable is
above or below the mean. (T/F)
2. Since a normal distribution curve extends from minus infinity to plus infinity, the area
under the curve is infinity.(T/F)
3. A standard normal distribution has a mean of zero and a standard deviation of one.(T/F)
4. The area to the left of the mean in any normal distribution is equal to _______.
a) the mean
b) 1
c) the variance
d) 0.5
5. If x is a normal random variable with mean 80 and standard deviation 5, the z-score for x
= 88 is ________.
a) 1.8
b) -1.8
c) 1.6
d) -1.6
6. Suppose x is a normal random variable with mean 60 and standard deviation 2. A z
score was calculated for a number, and the z score is 3.4. What is x?
a) 63.4
b) 56.6
c) 68.6
d) 66.8
4.5 Exponential Distribution

Another useful continuous probability distribution is the exponential distribution. It is closely
related to the Poisson distribution. The Poisson distribution is discrete and describes the number
of occurrences over an interval, whereas the exponential distribution is continuous and describes
the probability distribution of the time between random occurrences. The following are the
characteristics of the exponential distribution:
• It is a positively skewed distribution, which means the curve steadily decreases as x
gets larger
• It is a family of curves
• The x values range from 0 to infinity
• Its apex, i.e. its highest point, is always at x = 0

An exponential distribution is described by only one parameter, λ. Each value of λ gives a
different exponential distribution, thus resulting in a family of curves.

Figure 4: Exponential distributions for three values of λ

Figure 4 shows the exponential distribution for three values of λ. The points on the graph are
determined by substituting various values into the probability density function. The exponential
probability density function is:

f(x) = λ e^(−λx), where x ≥ 0, λ > 0 and e = 2.71828

The mean of an exponential distribution is µ = 1/λ and the standard deviation is σ = 1/λ.

4.5.1 Probabilities of the Exponential Distribution

Probabilities are computed by determining the area under the curve between two points.
Applying calculus to the exponential probability density function gives a formula that can be
used to compute probabilities for the exponential distribution:

P(x ≥ x0) = e^(−λx0)

where x0 ≥ 0 and x0 is the fraction or number of intervals between arrivals in the probability
question.

Let us take an example to illustrate the computation of probabilities for an exponential
distribution. The arrivals at a restaurant are Poisson distributed with λ = 1.5 customers per
minute. What is the average time between arrivals, and what is the probability that at least 2
minutes will elapse between one arrival and the next?

The inter-arrival time is an exponentially distributed random variable. The mean of the
exponential distribution is µ = 1/λ = 1/1.5 = 0.667 minutes, or 40 seconds. It means that, on
average, 40 seconds will elapse between the arrivals of two consecutive customers. The
probability of an interval of 2 or more minutes is calculated as follows:

P(x ≥ 2 | λ = 1.5) = e^(−1.5 × 2) = e^(−3) = 0.0498

About 4.98% of the time, when the rate of arrival is 1.5 per minute, 2 minutes or more will elapse
between arrivals. If the average rate of arrival λ is not given in the problem, it can be calculated
by transposing the formula, i.e. λ = 1/µ.

Illustration: The exponential distribution can also be used to solve Poisson-type problems in
which the intervals are not time. The Air Travel Consumer Report published that the average
number of mishandled baggage occurrences is 4.06 per 1000 passengers. Assume mishandled
baggage occurrences are Poisson distributed. Determine the average number of passengers
between occurrences. Suppose a bag has just been mishandled; what is the probability that fewer
than 190 passengers will pass before the next occurrence? What is the probability that the
number is between 190 and 495 passengers?

As λ = 4.06 per 1000 passengers, the mean of the exponential distribution is

µ = 1/λ = 1/4.06 = 0.2463 (in thousands of passengers), i.e. 0.2463 × 1000 = 246.3 passengers

The formula for computing exponential probabilities is for values x ≥ x0; however, we want the
probability of fewer than 190 passengers in this problem. This can be solved as follows:

x0 = 190/1000 passengers = 0.19

P(x ≥ 0.19) = e^(−λx0) = e^(−4.06 × 0.19) = e^(−0.7714) = 0.4624

As the total area under the curve is 1, P(x < 190) = 1 − 0.4624 = 0.5376
To find the probability between 190 and 495, consider the problem graphically: the required area
lies between x = 190 and x = 495.

P(x ≥ 0.495) = e^(−4.06 × 0.495) = e^(−2.0097) = 0.1340

We have already calculated P(x ≥ 0.19) = 0.4624. The required shaded area is therefore obtained
by subtracting P(x ≥ 495) from P(x ≥ 190): 0.4624 − 0.1340 = 0.3284.
In operations research, Poisson distribution in conjunction with exponential distribution is used
to solve queuing problems. The Poisson distribution is used to analyse the arrivals in a queue and
exponential distribution is used to analyse inter-arrival time.
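The exponential probabilities worked out above can be reproduced with a small Python sketch (illustrative; lam stands for λ and the names are ours):

# Illustrative sketch of the exponential probabilities in this section.
from math import exp

def p_at_least(x0, lam):
    """P(x >= x0) for an exponential distribution with rate lam (lambda)."""
    return exp(-lam * x0)

# Restaurant example: lambda = 1.5 arrivals per minute
print(round(p_at_least(2, 1.5), 4))        # 0.0498

# Baggage example: lambda = 4.06 per 1000 passengers, so x is measured in thousands
p_ge_190 = p_at_least(0.190, 4.06)
p_ge_495 = p_at_least(0.495, 4.06)
print(round(1 - p_ge_190, 4))              # P(x < 190) = 0.5376
print(round(p_ge_190 - p_ge_495, 4))       # about 0.3283 (0.3284 above, from rounded intermediate values)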

Check your progress 4


1. If arrivals at a bank followed a Poisson distribution, then the time between arrivals
would follow a binomial distribution.(True/False)
2. For an exponential distribution, the mean is always equal to its variance. (True /False)
3. At a certain workstation in an assembly line, the time required to assemble a component
is exponentially distributed with a mean time of 10 minutes. Find the probability that a
component is assembled in 3 to 7 minutes?
a) 0.5034
b) 0.2592
c) 0.2442
d) 0.2942
4. At a certain workstation in an assembly line, the time required to assemble a component
is exponentially distributed with a mean time of 10 minutes. Find the probability that a
component is assembled in 7 minutes or less?
a) 0.349
b) 0.591
c) 0.714
d) 0.503
5. On Saturdays, cars arrive at Shine Car Wash at the rate of 6 cars per fifteen minute
interval. The probability that at least 2 minutes will elapse between car arrivals is
_____________.
a) 0.0000
b) 0.4493
c) 0.1353
d) 1.0000

4.6 Let Us Sum Up


In this unit we discussed three different continuous probability distributions: the uniform
distribution, the normal distribution and the exponential distribution. For a continuous
distribution, probability is measured as the area under the curve, and the total area under the
curve is equal to one; in fact, the probability of any single point in a continuous distribution is
0.00. The simplest of these is the uniform distribution, also known as the rectangular distribution.
The uniform distribution is determined from a probability density function that takes equal
values along some interval between the points a and b. Probabilities are calculated as the portion
of the rectangle between the two points being considered.

The most widely used distribution is the normal distribution. Many phenomena are normally
distributed, such as characteristics of machine parts, many measurements of the natural
environment, and human characteristics such as height, weight, IQ and test scores. The
parameters necessary to describe a normal distribution are the mean and the standard deviation.
For convenience, the data are standardized by using the mean and standard deviation to compute
z scores. The probability corresponding to the z score of an x value can then be found from the
table of z scores. The normal distribution is also used to work certain types of binomial
distribution problems.

Another continuous distribution is the exponential distribution. It complements the discrete
Poisson distribution. The exponential distribution is used to compute probabilities of the times
between random occurrences. It is a family of curves described by the parameter λ. The
distribution is skewed to the right and its highest point is at x = 0.

4.7 Answers for Check Your Progress


Answers to check your progress 1

1. True
2. (a)

Answers to check your progress 2

1. True
2. (a)
3. (c)
4. (d)
5. (b)

Answers to check your progress 3

1.True
2. False
3.True
4. (d)
5. (c)
6. (d)

Answers to check your progress 4

1. False
2. False
3. (c)
4. (d)
5. (b)

4.8 Glossary
Uniform Probability Distribution: A continuous probability distribution in which the
probability that the random variable will assume a value in any interval of equal length is same
for each interval.
Probability Density function: The function that describes the probability distribution of a
continuous random variable
Normal Distribution: A continuous probability distribution whose probability density function
is bell shaped and is determined by the mean and standard deviation
Standard normal distribution: A normal distribution with mean of 0 and a standard deviation
of 1
Z Score: z score is the distance that an x value is from the mean µ in units of standard deviations
Exponential Distribution: A continuous probability distribution that is useful in describing the
time to complete a task or the time/interval between occurrences of an event

4.9 Assignment

1. What are continuous probability distributions? Discuss three major types.


2. Discuss the assumptions of normal distribution.
3. The Bureau of Labour Statistics releases figures on the number of full-time wage and salary
workers with flexible schedules. The numbers of full-time wage and salary workers in each
age category are almost uniformly distributed by age, with ages ranging from 18 to 65
years. If a worker with a flexible schedule is randomly drawn from the workforce, what is
the probability that the worker will be between 25 and 50 years of age? What are the mean
and height of the distribution?
4. The average speeds of passenger trains are normally distributed with a mean average speed
of 88 miles per hour and a standard deviation of 6.4 miles per hour. What is the probability
that a train will average less than 70 miles per hour? What is the probability that a train will
average between 90 and 100 miles per hour?
5. Inter-arrival times at a hospital emergency room during a weekday are exponentially
distributed, with an average inter-arrival time of nine minutes. If the arrivals are Poisson
distributed, what would the average number of arrivals per hour be? What is the probability
that less than five minutes elapse between any two arrivals?

4.10 Activities
Use the probability density formula to sketch the graphs of the following exponential
distributions: (a) λ = 0.2, (b) λ = 0.4, (c) λ = 0.4. Hint: use x = 0, 1, 2, 3, … and find f(x).

4.11 Case Study


Design Point Engineers specializes in constructing concrete foundations for new houses in
Kerala. The company knows that, because of soil types, moisture conditions, variable
construction and other factors, most foundations will eventually need major repair. On the basis
of its records, the company’s president believes that a new house foundation, on average, will not
need major repair for 20 years. If she wants to guarantee the company’s work against major
repair, but wants to honor no more than 10% of its guarantees, for how many years should the
company guarantee its work? Assume that occurrences of major foundation repairs are Poisson
distributed.
4.12 Further Reading

1. Applied Business Statistics, Ken Black, Wiley Publications.


2. Business Statistics, David M. Levine et al, Pearson Education
3. Statistics for management, Levin and Rubin, Pearson Education
4. Business Statistics by J.K. Sharma, Pearson Education
5. Business Statistics by Naval Bajpai, Pearson Education
Block Summary

In this block, we studied how quantitative methods may be used to help managers make better
decisions. In the first unit the meaning and use of various quantitative analysis methods in the
field of business and management was explained. In this unit, the basic difference between
statistics and operations research was discussed along with their techniques. In the second unit,
the concept of measures of central tendency was introduced. Various measures of central
tendency and their relative importance were discussed. In the third unit the applications of various
types of discrete probability distributions were discussed. In the last unit, continuous probability
distributions and their various applications were covered.

Block Assignment

Short Answer Questions

1. Differentiate between a statistic and a parameter.


2. Discuss situations where geometric mean is a preferred measure of central tendency.
3. What do you mean by 75th percentile?
4. Give two examples each of discrete and continuous random variable

Long answer Questions

1. What is an iconic model? How it is different from analog model?


2. What is a discrete probability distribution? How do we describe a discrete probability
distribution?
3. Calculate the arithmetic mean and the median of the frequency distribution given
below. Also calculate the mode using the empirical relation among the mean, median and
mode:
Height (in cm) No. of students
130-134 05
135-139 15
140-144 28
145-149 24
150-154 17
155-159 10
160-164 01
4. Suppose the average speed of a passenger train travelling from Mumbai to Delhi is
normally distributed with a mean average speed of 88 miles per hour and the standard
deviation of 6.4 miles per hour.
a. What is the probability that a train will average less than 70 mile per hour?
b. What is the probability that a train will average more than 80 mile per hour?
c. What is the probability that a train will average between 90 and 100?

5. The Poisson distribution of annual trips per family to amusement parks has an average of
0.6 trips per year. What is the probability that a randomly selected family did not make a trip
to an amusement park last year? What is the probability that a randomly selected family took
three or fewer trips to amusement parks over a three-year period?
Block Structure

Block no. 2 Decision making and forecasting methods


______________________________________________
Block Introduction

In this block, we will study decision-making techniques which are used to make business
decisions and forecasts. In the first unit, the concept of decision making will be discussed, along
with the decision tree approach and related concepts such as single-stage decisions, multi-stage
decisions, key issues, and types of decision environments. In the second unit, we will explore
relationships between variables through correlation and regression analysis and learn how to
develop models that can be used to predict one variable from another. Here, we will also learn to
make meaningful predictions from the given data by fitting them to a linear function. In the third
unit, some of the basic concepts of forecasting will be discussed for planning and understanding
decisions in a scientific way. We will also explore the statistical techniques that can be used to
forecast values from time-series data and to know how well the forecasting is being done.

Objectives
After learning this block, you will be able to:
• Understand decision problems which involve various uncertainties in different types of
environments
• Understand the decision-making process
• Analyze problems using decision tree Approach
• Make decisions under uncertainty
• Analyze situations where probabilities of outcomes are uncertain
• Understand the concept of correlation
• Understand the role of regression in establishing mathematical relationships
betweendependent and independent variables from given data
• Use the least squares criterion to estimate the model parameters
• Learn the meaning and calculation of residuals
• Identify the standard errors of estimate

• Know when to use various forecasting methods.


• Understand different types of forecast models
• Understand time series analysis - moving averages, exponential smoothing, least square
regression trend analysis for demand forecasting.
• Calculate different measures of forecast accuracy.

Block Structure

Unit 1: Decision Theory

Unit 2: Correlation and Regression Analysis

Unit 3: Forecasting
Unit No. 1 Decision Theory
______________________________________________
Unit Structure
1.0 Learning Objectives

1.1 Introduction
1.1.1 Types of decision-making environments
Check your progress 1

1.2 Key problems in decision theory


Check your progress 2

1.3 Decision making process


1.3.1 One-stage decision making process with uncertainty and risk
1.3.1.1 Criteria for decision-making under uncertainty
1.3.1.2 Decision-making under risk with EMV
1.3.2 Multi-stage decision making process with certainty (Decision Tree Approach)
Check your progress 3

1.4 Let Us Sum Up

1.5 Answers for Check your Progress

1.6 Glossary

1.7 Assignment

1.8 Activities

1.9 Case Study

1.10 Further Reading


1.0 Learning Objectives

After learning this unit, you will be able to:


• Understand decision problems which involve various uncertainties in different types of
environments
• Understand the decision-making process
• Analyze problems using decision tree Approach
• Make decisions under uncertainty
• Analyze situations where probabilities of outcomes are uncertain.

1.1 Introduction

Every stage of our life, including the day-to-day routine, involves various kinds of decisions.
Decision problems are everywhere, and decision theory is concerned with making good decisions.
People from different times and fields have used decision theory, under different environments, to
arrive at final decisions. The analysis varies with the nature of the decision problem, so any basis
for classifying decision problems gives us a means of selecting the appropriate decision analysis
approach. An important condition for the existence of a decision problem is the presence of
alternative ways of action. Each action leads to a consequence through a possible set of outcomes,
and the information about these outcomes may be known or unknown. One of the several ways of
classifying decision problems is based on this knowledge about the information on outcomes.
Broadly, two classifications result:

a) The information on outcomes is deterministic and known with certainty, and
b) The information on outcomes is probabilistic (uncertain), with the probabilities known or
unknown.

The former is classified as decision making under certainty, while the latter is called decision
making under uncertainty. The theory that has resulted from analyzing decision problems in
uncertain situations is commonly known as decision theory. The agenda of this unit is to study
some methods for solving decision problems under uncertainty. Decision theory is an analytic
and systematic approach to decision making. A good decision is one that is based on logic,
considers all available data and possible alternatives, and applies the quantitative approach
described here.

1.1.1 Types of decision-making environments

Type 1: Decision making under certainty: The decision maker knows with certainty the
consequences of every alternative or decision choice.
Type 2: Decision making under uncertainty: The decision maker does not know the probabilities
of the various outcomes.
Type 3: Decision making under risk: The decision maker knows the probabilities of the various
outcomes.
Check your progress 1
1. When the information on outcomes is deterministic and known with certainty, the
situation is known as ____________

2. The necessary condition for the existence of decision problem is the presence
of___________

3. When the decision maker knows the probabilities of the various outcomes, the
situation is known as ___________________

4. Which theory concerns making sound decisions under conditions of certainty, risk and
uncertainty
a. Game Theory
b. Network Analysis
c. Decision Theory
d. None of the above

1.2 Key problems in decision theory

Different problems arise while analyzing decision problems with uncertain outcomes. The first
point is that decisions can be viewed either as independent decisions (one-stage or one-time
decisions) or as a sequence of decisions taken over a period of time. So, depending on the
planning horizon and the nature of the decisions, we have either a single-stage decision problem
or a sequential decision problem. In real life, decisions are generally sequential and thus difficult
to solve. Fortunately, valid assumptions in most cases help to reduce the number of stages and
make the problem solvable. Decision theory therefore deals basically with the following two
types of problems:
(a) One-stage decision making process
(b) Multi-stage decision making process

Now consider the problem of finding the number of magazine copies one should stock in the
face of uncertain demand, such that the expected profit is maximized. A critical evaluation of the
method shows that the calculation becomes tedious as the number of values the demand can take
increases. You can also try the method with a discrete distribution of demand, where demand can
take values over some range, and then do trial and error for each value of demand, which is again
a time-consuming task. So it calls for separate techniques to make decisions. We will learn a
technique for solving such single-stage problems, called marginal analysis. For sequential
decision problems, the decision tree approach is helpful and will be explained in a later section.
In the analysis we will be using several criteria, but the main one is the expected monetary value
criterion (the other criteria are explained in the next section). However, this criterion suffers from
two problems. Expected profit, or expected monetary value (EMV) as it is more commonly
known, does not take into account the decision maker's attitude towards risk. The other problem
with expected monetary value is that it can be applied only when the probabilities of the
outcomes are known. For problems where the probabilities are unknown, one way is to assign
equal probabilities to the outcomes and then use EMV for decision making. However, this is not
always rational, and other criteria are available for deciding in such situations.

Check your progress 2


1. One stage decision making process is known as____________________
2. The main criteria to deal with decision problem is__________________
3. Expected monetary value concept does not consider the decision
maker’s______________ for risk
4. EMV’s application is only when probabilities of____________ are known

1.3 Decision Making Process

The following are the steps of the decision-making process, which can be used with any
approach:

1. Clearly define the problem at hand.


2. List the possible alternatives.
3. Identify the possible outcomes or states of nature.
4. List the payoff (typically profit) of each combination of alternatives and outcomes.
5. Select one of the mathematical decision theory models. (Marginal or decision tree
approach whichever is applicable)
6. Apply the model and make your decision.
Example I
Decision Table with Conditional Values for Krishna Manufacturer.

State of Nature
Alternative Favourable Market Unfavourable Market
Construct a large plant 200,000 -180,000
Construct a small plant 100,000 -20,000
Do nothing 0 0

1.3.1 One-stage decision making process with uncertainty and risk

Example I can be treated as decision making under uncertainty, since no probabilities are
associated with the states of nature; if the probabilities are known, it becomes decision making
under risk. For decision making under uncertainty, the following criteria can be used.

1.3.1.1 Criteria for decision-making under uncertainty

1. Maximax (optimistic): Used to find the alternative that maximizes the maximum payoff.
Locate the maximum payoff for each alternative. Select the alternative with the largest of these
maximum payoffs.

State of Nature
Alternative Favourable Market Unfavourable Market Maximum in a row
Construct a large plant 200,000 -180,000 200,000
Construct a small plant 100,000 -20,000 100,000
Do nothing 0 0 0

2. Maximin (pessimistic): Used to find the alternative that maximizes the minimum payoff.
Locate the minimum payoff for each alternative. Select the alternative with the largest of these
minimum payoffs.

State of Nature
Alternative Favourable Market Unfavourable Market Minimum in a row
Construct a large plant 200,000 -180,000 -180,000
Construct a small plant 100,000 -20,000 -20,000
Do nothing 0 0 0

3. Criterion of realism (Hurwicz): This is a weighted average compromise between optimism
and pessimism. Select a coefficient of realism α, with 0 ≤ α ≤ 1. A value of 1 is perfectly
optimistic, while a value of 0 is perfectly pessimistic. Compute the weighted average for each
alternative. Select the alternative with the highest value. Any value of α between 0 and 1 can be
chosen, as explained.

Weighted average = α × (maximum in row) + (1 − α) × (minimum in row)

For the large plant alternative, using α = 0.8:

(0.8)(200,000) + (1 − 0.8)(−180,000) = 124,000

For the small plant alternative, using α = 0.8:

(0.8)(100,000) + (1 − 0.8)(−20,000) = 76,000

State of Nature
Alternative Favourable Market Unfavourable Market Criterion of Realism (α = 0.8)
Construct a large plant 200,000 -180,000 124,000
Construct a small plant 100,000 -20,000 76,000
Do nothing 0 0 0

4. Equally likely (Laplace): Considers all the payoffs for each alternative equally. Find the
average payoff for each alternative. Select the alternative with the highest average.

State of Nature
Alternative Favourable Market Unfavourable Market Row Average
Construct a large plant 200,000 -180,000 10,000
Construct a small plant 100,000 -20,000 40,000
Do nothing 0 0 0
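A brief Python sketch (our illustration, using the payoffs from Example I and α = 0.8) computes all four criteria at once:

# Illustrative sketch: the four uncertainty criteria for the Krishna Manufacturer payoffs.
payoffs = {
    "Large plant": [200_000, -180_000],
    "Small plant": [100_000, -20_000],
    "Do nothing":  [0, 0],
}
alpha = 0.8  # coefficient of realism for the Hurwicz criterion

maximax = max(payoffs, key=lambda a: max(payoffs[a]))
maximin = max(payoffs, key=lambda a: min(payoffs[a]))
hurwicz = max(payoffs, key=lambda a: alpha * max(payoffs[a]) + (1 - alpha) * min(payoffs[a]))
laplace = max(payoffs, key=lambda a: sum(payoffs[a]) / len(payoffs[a]))

print(maximax, maximin, hurwicz, laplace)
# Large plant, Do nothing, Large plant, Small plant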

1.3.1.2 Decision-making under risk with EMV

This is decision making when there are several possible states of nature and the probabilities
associated with each possible state are known. The most popular method is to choose the
alternative with the highest expected monetary value (EMV).

EMV = (payoff of first state of nature) × (probability of first state of nature)
    + (payoff of second state of nature) × (probability of second state of nature)
    + … + (payoff of last state of nature) × (probability of last state of nature)

Suppose in Example I each market outcome has a probability of occurrence of 0.50. Which
alternative would give the highest EMV? The calculations are as follows; select the alternative
with the highest EMV.
EMV (large plant) = (200,000)(0.5) + (–180,000)(0.5)= 10,000
EMV (small plant) = (100,000)(0.5) + (–20,000)(0.5)= 40,000
EMV (do nothing) = (0)(0.5) + (0)(0.5)= 0
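The same EMV calculation can be sketched in Python (illustrative; payoffs and probabilities as in Example I):

# Illustrative sketch of the EMV criterion with P(favourable) = P(unfavourable) = 0.5.
payoffs = {
    "Large plant": [200_000, -180_000],
    "Small plant": [100_000, -20_000],
    "Do nothing":  [0, 0],
}
probs = [0.5, 0.5]   # probabilities of the two states of nature

emv = {alt: sum(p * v for p, v in zip(probs, vals)) for alt, vals in payoffs.items()}
print(emv)                      # {'Large plant': 10000.0, 'Small plant': 40000.0, 'Do nothing': 0.0}
print(max(emv, key=emv.get))    # Small plant has the highest EMV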
1.3.2 Multi-stage decision making process with certainty (Decision Tree
Approach)

Any problem that can be presented in a decision table can also be graphically represented in a
decision tree. Decision trees are most beneficial when a sequence of decisions must be made. All
decision trees contain decision points or nodes, from which one of several alternatives may be
chosen, and state-of-nature points or nodes, out of which one state of nature will occur.

Steps of decision tree analysis

1. Define the problem.


2. Structure or draw the decision tree.
3. Assign probabilities to the states of nature.
4. Estimate payoffs for each possible combination of alternatives and states of nature.
5. Solve the problem by computing expected monetary values (EMVs) for each state of
nature node.

Structure of decision-tree

Trees are drawn from left to right and represent decisions and outcomes in sequential order.
• Squares represent decision nodes.
• Circles represent state-of-nature nodes.
• Lines or branches connect the decision nodes and the states of nature.

Figure: Basic structure of the decision tree for Example I

Figure: Final solution of the decision tree for Example I, with calculations
Check your progress 3
1. In the Hurwicz criterion the value of α is always between _________
2. Select the alternative with highest average payoff is given in the rule of _____________
3. When a sequence of decisions with known probabilities must be analyzed, the approach
used is known as ______________

1.4 Let Us Sum Up

Decision theory provides us with the structure and methods for analyzing decision problems
under uncertainty, certainty and risk. Decision problems under uncertainty are characterized by
different courses of action and uncertain or risky outcomes corresponding to each action or
alternative. The problems can involve a single-stage or a multi-stage decision process. Expected
monetary value and the other criteria are helpful in solving single-stage problems, whereas the
decision tree approach is useful for solving multi-stage problems. In this unit we have learned
how to apply these methods to solve decision problems. The main objective behind using these
decision-making methods is to maximize the expected monetary value (EMV). Finding the EMV
with either method essentially assumes that the decision maker is risk neutral and makes
decisions based on the expected outcomes.

1.5 Answers for Check your Progress

Answers to Check your progress 1


1. Decision making under certainty

2. Alternative ways of actions

3. Decision making under risk

4. C

Answers to Check your progress 2


1. Marginal Analysis

2. EMV

3. Attitude

4. Outcomes
Answers to Check your progress 3
1. 0 to 1

2. Laplace

3. Decision Tree Approach

1.6 Glossary

Decision making under certainty: The decision maker knows with certainty the consequences
of every alternative or decision choice.

Decision making under uncertainty: The decision maker does not know the probabilities of the
various outcomes.

Decision making under risk: The decision maker knows the probabilities of the various
outcomes.

Maximax (optimistic): Used to find the alternative that maximizes the maximum payoff.

Maximin (pessimistic): Used to find the alternative that maximizes the minimum payoff.

Criterion of realism (Hurwicz):This is a weighted average compromise between optimism and


pessimism.

Equally likely (Laplace): Considers all the payoffs for each alternative with highest average.

EMV: The highest expected monetary value means payoff of particular decision multiply by
probability of occurrence.

Decision Tree: It represents decisions and outcomes in sequential order.

Squares in decision tree: It represents decision nodes.

Circles in decision tree: It represents states of nature nodes.

Lines or branches in decision tree: It connects the decisions nodes and the states of nature.

1.7 Assignment
1. A small group of investors is considering planting a tree farm. Their choices are (1) don’t plant
trees, (2) plant a small number of trees, or (3) plant a large number of trees. The investors are
concerned about the demand for trees. If demand for trees declines, planting a large tree farm
would probably result in a loss. However, if a large increase in the demand for trees occurs, not
planting a tree farm could mean a large loss in revenue opportunity. They determine that three
states of demand are possible: (1) demand declines, (2) demand remains the same as it is, and (3)
demand increases. Use the following decision table to compute the expected monetary value for
this decision opportunity. Also show the decision tree for the same.

State of Demand
Decision Alternatives Decline (0.20) Same (0.30) Increase (0.50)
Don’t Plant 30 0 -40
Small Tree Farm -80 15 190
Large Tree Farm -550 -120 750

2. Some oil speculators are interested in drilling an oil well. The rights to the land have been
secured and they must decide whether to drill. The states of nature are that oil is present or that
no oil is present. Their two decision alternatives are to drill or not to drill. If they strike oil, the
well will pay Rs. 2 million. If they have a dry hole, they will lose Rs. 150,000. If they don’t drill,
their payoff is Rs. 0 whether oil is present or not. The probability that oil is present is 0.12. Use
this information to construct a decision table and a decision tree, and compute the expected
monetary value for this problem.

3. A car rental agency faces the decision of buying a fleet of cars, all of which will be the same
size. It can purchase a fleet of small cars, medium cars, or large cars. The smallest cars are the
most fuel efficient and the largest cars are the greatest fuel users. One of the problems for the
decision makers is that they do not know whether the price of fuel will increase or decrease in
the near future. If the price increases, the small cars are likely to be most popular. If the price
decreases, customers may demand the larger cars. Following is a decision table with these
decision alternatives, the states of nature, the probabilities, and the payoffs. Use this information
to determine the expected monetary value for this problem.

State of Nature
Decision Alternatives Fuel Decrease (0.70) Fuel Increase (0.30)
Small Cars 225 450
Medium Cars -175 -135
Large Cars 400 380

1.8 Activities

1. Suppose you have the option of investing either in Project A or in Project B. The outcomes of
both the projects are uncertain. If you invest in Project A, there is a 98% chance of making Rs.
25,000 profit, and 2% chance of losing Rs. 90,000. If project B is chosen, there is a 50-50 chance
of making a profit of Rs. 7,000 or Rs. 17,000. Which project will you choose and why?
2. Suppose in Activity 1 above, you have calculated the expected payoff (EMV) for both the
projects as follows:
EMVA = 0.98 × 25,000 − 0.02 × 90,000 = Rs. 22,700.
EMVB = 0.5 × 7,000 + 0.5 × 17,000 = Rs. 12,000.
You have thus found that by investing in Project A you can expect more money, so you have
chosen A. Your friend, when given the same option, chooses B, arguing that he would not like to
go bankrupt (losing 90,000) by choosing A. How do you reconcile these two arguments?

1.9 Case Study


The Property Company: A property owner is faced with a choice of:
(a) A large-scale investment (A) to improve her flats. This could produce a substantial pay-off in
terms of increased revenue net of costs but will require an investment of Rs.1,400,000. After
extensive market research it is considered that there is a 40% chance that a pay-off
of Rs.2,500,000 will be obtained, but there is a 60% chance that it will be only Rs.800,000.

(b) A smaller scale project (B) to re-decorate her premises. At Rs.500,000 this is less costly but
will produce a lower pay-off. Research data suggests a 30% chance of a gain of Rs.1,000,000 but
a 70% chance of it being only Rs.500,000.

(c) Continuing the present operation without change (C). It will cost nothing, but neither will it
produce any pay-off. Clients will be unhappy and it will become harder and harder to rent the
flats out when they become free.

How will a decision tree help the taking of the decision?

1.10 Further Reading

1. Business Statistics: For Contemporary Decision Making, by Ken Black, Wiley Publication
2. Quantitative Techniques in Management, by N.D. Vora, McGraw hills
3. Operations Research theory and Applications, by J.K. Sharma, Macmillan
4. Operations Research, By Hamdy A Taha, Pearson Education
5. Quantitative Analysis for Management, by Barry Render, Ralph M. Stair, Michael E. Hanna
and T. N. Badri, Pearson Publication
6. Statistics for management, Levin and Rubin, Pearson Education
7. Business Statistics, David M. Levine et al, Pearson Education
8. Use of software like QM for Windows, Excel Solver
Unit No. 2 Correlation and Regression
Analysis
_________________________________
Unit Structure
2.0 Learning Objectives

2.1 Introduction

2.2 Pearson Product Moment Correlation Coefficient (r)


Check your progress 1

2.3 Simple Regression Analysis


2.3.1 Residual analysis
2.3.2 Standard Error of the Estimate
Check your progress 2

2.4 Coefficient of Determination (r²)


2.4.1 Relationship between Correlation Coefficient and Coefficient of
Determination
Check your progress 3

2.5 Let Us Sum Up

2.6 Answers for Check your Progress

2.7 Glossary

2.8 Assignment

2.9 Activities

2.10 Case Study

2.11 Further Reading


2.0 Learning Objectives

After learning this unit, you will be able to:

• Understand the concept of correlation


• Understand the role of regression in establishing mathematical relationships between
dependent and independent variables from given data
• Use the least squares criterion to estimate the model parameters
• Learn the meaning and calculation of residuals
• Identify the standard errors of estimate

2.1 Introduction

In industry and business today, large amounts of data are continuously being generated, and
this calls for statistical analysis of mass data. Data is an asset for any business. This data can
be the company's annual production, annual sales, capacity utilization, turnover, profits,
man-power levels, absenteeism or some other variable of direct interest to management. In
general, the data can relate to any aspect of finance, marketing, human resources, inventory
or production, or there might be technical data regarding processes such as temperature,
pressure, etc. Sometimes it is related to quality control issues. The accumulated data can be
used to gain information about the system (for instance, what happens to the market return
when the Sensex goes down), to identify past patterns of trends or behavior, or simply for
control purposes, to check whether the process or system is operating as planned and
designed (as, for instance, in quality control). The main objective of learning correlation and
regression is to extract the main features of the relationships and impacts hidden in, or
implied by, the mass of data.

The data we analyze can have many variables, and it is of interest to examine the effects that
some variables have on others. Identifying the exact functional relationship between
variables can be too complex, but we may wish to approximate the relationship by some
simple mathematical device such as a correlation coefficient or a straight (least squares) line.
For instance, the monthly consumption of raw materials at a particular company, the daily
demand for a particular product, or the weekly price change of petrol could all be variables
of interest. We are, however, interested in some key performance variable (let us consider
sales and advertisement) and would like to see how this key variable (called the response or
dependent variable, here sales) is affected by the other variables (often called independent or
explanatory variables, here advertisement).

2.2 Pearson Product Moment Correlation Coefficient (r)


Correlation is a measure of the degree of relatedness of variables. It can help a business
researcher determine, for example, whether the stocks of two airlines rise and fall in any related
manner. For a sample of pairs of data, correlation analysis can yield a numerical value that
represents the degree of relatedness of the two stock prices over time. In the transportation
industry, is a correlation evident between the price of transportation and the weight of the object
being shipped? If so, how strong are the correlations? In economics, how strong is the correlation
between the producer price index and the unemployment rate? In retail sales, are sales related to
population density, number of competitors, size of the store, amount of advertising, or other
variables? Because researchers virtually always deal with sample data, this section introduces a widely
used sample coefficient of correlation, r. This measure is applicable only if both variables being
analyzed have at least an interval level of data. The statistic r is the Pearson product-moment
correlation coefficient, named after Karl Pearson (1857–1936), an English statistician who
developed several coefficients of correlation along with other significant statistical concepts. The
term r is a measure of the linear correlation of two variables. It is a number that ranges from -1 to
0 to +1, representing the strength of the relationship between the variables. An r value of +1
denotes a perfect positive relationship between two sets of numbers. An r value of -1 denotes a
perfect negative correlation, which indicates an inverse relationship between two variables: as
one variable gets larger, the other gets smaller. An r value of 0 means no linear relationship is
present between the two variables. The formula for computing r is:

r = [Σxy − (Σx)(Σy)/n] / √{[Σx² − (Σx)²/n] [Σy² − (Σy)²/n]}

Figure: Example scatter plots — (a) Strong negative correlation (r = −.933), (b) Moderate
negative correlation (r = −.674), (c) Moderate positive correlation (r = .518), (d) Strong positive
correlation (r = .909), (e) Virtually no correlation (r = −.004)

Example I

A study is designed to check the relationship between smoking and longevity. A sample of 15
men aged 50 years and older was taken, and the average number of cigarettes smoked per day
and the age at death were recorded. Here cigarette smoking is the independent variable (X) and
longevity is the dependent variable (Y); n, the number of pairs, is 15.

No.   Cigarettes (X)   Longevity (Y)   X·Y     X²      Y²
1     5                80              400     25      6400
2     23               78              1794    529     6084
3     25               60              1500    625     3600
4     48               53              2544    2304    2809
5     17               85              1445    289     7225
6     8                84              672     64      7056
7     4                73              292     16      5329
8     26               79              2054    676     6241
9     11               81              891     121     6561
10    19               75              1425    361     5625
11    14               68              952     196     4624
12    35               72              2520    1225    5184
13    29               58              1682    841     3364
14    4                92              368     16      8464
15    23               65              1495    529     4225
Total ∑X = 291         ∑Y = 1103       ∑XY = 20034   ∑X² = 7817   ∑Y² = 82791

Putting all the calculated values into the formula given above yields r = -0.71343, a moderate
negative (inverse) correlation between the two variables. In other words, the fewer cigarettes
smoked per day, the greater the longevity.
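The computation can be reproduced with a short Python sketch (illustrative only; Python is not among the software listed for this course, and the variable names are our own):

# Pearson correlation for Example I using the raw-sums form of the formula.
from math import sqrt

cigarettes = [5, 23, 25, 48, 17, 8, 4, 26, 11, 19, 14, 35, 29, 4, 23]      # X
longevity  = [80, 78, 60, 53, 85, 84, 73, 79, 81, 75, 68, 72, 58, 92, 65]  # Y

n = len(cigarettes)
sum_x, sum_y = sum(cigarettes), sum(longevity)
sum_xy = sum(x * y for x, y in zip(cigarettes, longevity))
sum_x2 = sum(x * x for x in cigarettes)
sum_y2 = sum(y * y for y in longevity)

numerator = sum_xy - (sum_x * sum_y) / n
denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
print(round(numerator / denominator, 5))   # about -0.71343, matching the worked answer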

Check your progress 1


1. The correlation coefficient must lie between

a)0 and 1
b)-1 to 0 to 1
c)-1
d)None of the above

2. If the value of r between two variables is -0.65, the type of correlation is

a) Strong Negative Correlation


b) Strong Positive Correlation
c)Moderate Negative Correlation
d)No correlation

3. The correlation coefficient is used to determine:

a) A specific value of the y-variable given a specific value of the x-variable


b) A specific value of the x-variable given a specific value of the y-variable
c) The strength of the relationship between the x and y variables
d) None of these

2.3 Simple Regression Analysis


Regression analysis is the process of constructing a mathematical model or function that can be
used to predict or determine one variable by another variable. The most elementary regression
model is called simple regression or bivariate regression involving two variables in which one
variable is predicted by another variable. In simple regression, the variable to be predicted is
called the dependent variable and is designated as Y. The predictor is called the independent
variable, or explanatory variable, and is designated as X. In simple regression analysis, only a
straight-line relationship between two variables is examined.

Equation of The Simple Regression Line:

ŷ = b0 + b1x
where

ŷ is the predicted value of the dependent variable Y (the variable plotted on the Y axis), x is the
independent variable (plotted on the X axis), b1 is the slope of the line and b0 is the y-intercept.

Example II
In the table below, the xi column shows scores on the aptitude test and the yi column shows statistics
grades. Carry out the regression analysis and the residual analysis, and compute the standard error of the estimate.

Student   Aptitude Marks (x)   Statistics Marks (y)   (x − x̄)²   (y − ȳ)²   (x − x̄)(y − ȳ)
1 95 85 289 64 136
2 85 95 49 324 126
3 80 70 4 49 -14
4 70 65 64 144 96
5 60 70 324 49 126
∑x = 390 ∑y = 385 Σ(𝑥− 𝑥)2 = 730 Σ(𝑦− 𝑦̅)2 = 630 Σ (𝑥− 𝑥̅)(𝑦− 𝑦̅) = 470
Mean Mean
𝑥̅ = 78 𝑦̅ = 77

First, we solve for the regression coefficient (b1):

𝑏1= Σ(𝑥− 𝑥̅)(𝑦− 𝑦̅)/Σ(𝑥− 𝑥̅)2


b1 = 470/730
b1 = 0.644

Once we know the value of the regression coefficient (b1), we can solve for the y-intercept (b0):

b0 = 𝑦̅− 𝑏1 𝑥̅
b0 = 77 - (0.644)(78)
b0 = 26.768

Therefore, the regression equation is: ŷ = 26.768 + 0.644x .

Now you can predict the value of statistics marks (Y) for any value of aptitude marks (X). Suppose a
student scores 88 marks in the aptitude test; what will his/her score in statistics be?
Here X = 88, and substituting this value into the regression equation gives ŷ = 26.768 + 0.644 * 88 =
83.44 marks in statistics.
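A minimal Python sketch of this least squares calculation, using the data of Example II (variable names are illustrative):

# Fit the simple regression line and predict statistics marks for an aptitude score of 88.
aptitude = [95, 85, 80, 70, 60]           # x
statistics_marks = [85, 95, 70, 65, 70]   # y

n = len(aptitude)
x_bar = sum(aptitude) / n
y_bar = sum(statistics_marks) / n

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(aptitude, statistics_marks))
sxx = sum((x - x_bar) ** 2 for x in aptitude)

b1 = sxy / sxx             # slope = 470/730, about 0.644
b0 = y_bar - b1 * x_bar    # intercept, about 26.78 (the text rounds b1 first and gets 26.768)

print(round(b1, 3), round(b0, 3), round(b0 + b1 * 88, 2))   # 0.644 26.781 83.44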

2.3.1 Residual analysis

Each difference between an actual y value and the corresponding predicted y value is the error of the
regression line at that point and is referred to as the residual. It is the sum of squares of these
residuals that is minimized to find the least squares line. You can find the predicted y values by
substituting the x values one by one into the regression line that has already been developed.

Student   Aptitude Marks (x)   Statistics Marks (y)   ŷ = 26.768 + 0.644x   y − ŷ
1 95 85 87.948 -2.948
2 85 95 81.508 13.492
3 80 70 78.288 -8.288
4 70 65 71.848 -6.848
5 60 70 65.408 4.592
Σ(y − ŷ) = 0.00

2.3.2 Standard Error of the Estimate

Residuals represent errors of estimation for individual points. With large samples of data, residual
computations become laborious. Even with computers, a researcher sometimes has difficulty
working through pages of residuals in an effort to understand the error of the regression model.
An alternative way of examining the error of the model is the standard error of the estimate, which
provides a single measurement of the regression error. Because the sum of the residuals is zero,
attempting to determine the total amount of error by summing the residuals is fruitless. This zero-
sum characteristic of residuals can be avoided by squaring the residuals and then summing them.

Student   Aptitude Marks (x)   Statistics Marks (y)   ŷ = 26.768 + 0.644x   (y − ŷ)   (y − ŷ)²   y²
1   95   85   87.948   -2.948   8.690   7225
2   85   95   81.508   13.492   182.030   9025
3   80   70   78.288   -8.288   68.690   4900
4   70   65   71.848   -6.848   46.895   4225
5   60   70   65.408   4.592   21.086   4900
Σ(y − ŷ) = 0.00   Σ(y − ŷ)² = 327.391   Σy² = 30275

First calculate SSE = Σ(y − ŷ)²  or  SSE = Σy² − b0Σy − b1Σxy. From the table above, SSE = 327.391.

Standard error of the estimate:

se = √( SSE / (n − 2) ) = √( 327.391 / 3 ) = 10.44
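Continuing the same example, the residuals, SSE and standard error of the estimate can be checked with the following Python sketch (illustrative only; it reuses the rounded coefficients 26.768 and 0.644 from the text):

# Residual analysis and standard error of the estimate for Example II.
from math import sqrt

aptitude = [95, 85, 80, 70, 60]           # x
statistics_marks = [85, 95, 70, 65, 70]   # y
b0, b1 = 26.768, 0.644                    # rounded coefficients from the worked example

fitted = [b0 + b1 * x for x in aptitude]
residuals = [y - y_hat for y, y_hat in zip(statistics_marks, fitted)]

sse = sum(e ** 2 for e in residuals)      # about 327.40 (the text gets 327.391 with rounded squares)
se = sqrt(sse / (len(aptitude) - 2))      # about 10.45 (the text reports 10.44)
print([round(e, 3) for e in residuals], round(sse, 2), round(se, 2))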
Check your progress 2
1. The relationship between number of beers consumed (x) and blood alcohol content (y)
was studied in 16 male college students by using least squares regression. The following
regression equation was obtained from this study:
ŷ = -0.0127 + 0.0180x
The above equation implies that:

a) Each beer consumed increases blood alcohol by 1.27%


b) On average it takes 1.8 beers to increase blood alcohol content by 1%
c) Each beer consumed increases blood alcohol by an average amount of 1.8%
d) Each beer consumed increases blood alcohol by exactly 0.018

2. If two variables, x and y, have a very strong linear relationship, then

a) There is evidence that x causes a change in y


b) There is evidence that y causes a change in x
c) There might not be any causal relationship between x and y
d) None of these alternatives is correct.

3. In regression analysis, the variable that is being predicted is the

a) Response, or dependent, variable


b) Independent variable
c) Intervening variable
d) Is usually x

2.4 Coefficient of Determination (r²)

A widely used measure of fit for regression models is the coefficient of determination, or r².
The coefficient of determination is the proportion of the variability of the dependent variable (y)
accounted for, or explained by, the independent variable (x). The coefficient of determination
ranges from 0 to 1. An r² of zero means that the predictor accounts for none of the variability of
the dependent variable and that there is no regression prediction of y by x. An r² of 1 means
perfect prediction of y by x, with 100% of the variability of y accounted for by x. Of course,
most r² values fall between these extremes. The researcher must interpret whether a particular r²
is high or low, depending on the use of the model and the context within which the model was
developed. In the correlation example the answer was r = -0.71, so its square is r² = 0.5041. That
means about 50% of the variation in the dependent variable y is explained by the independent variable x.

2.4.1 Relationship between Correlation Coefficient (r) and Coefficient of Determination (r²)

Is r, the coefficient of correlation, related to r², the coefficient of determination, in linear
regression? The answer is yes: the coefficient of determination is the square of the coefficient of
correlation. For example, suppose a regression model was developed to predict FTEs by the number
of hospital beds and the r² value for the model was .886. Taking the square root of this value yields
r = .941, which is the correlation between the sample number of beds and FTEs.

Note:
Because r² is always positive, solving for r by taking the square root of r² gives the correct magnitude
of r but may give the wrong sign. The researcher must examine the sign of the slope of the regression
line to determine whether a positive or negative relationship exists between the variables and then
assign the appropriate sign to the correlation value.

Check your progress 3


1. If r = 0.8045, the coefficient of determination equals

a) 0.6471
b) -0.6471
c) 0
d) 1
2. Suppose the correlation coefficient between height (as measured in feet) versus
weight (as measured in pounds) is 0.40. What is the correlation coefficient of height
measured in inches versus weight measured in ounces? [12 inches = one foot; 16 ounces
= one pound]

a) 0.40
b) 0.30
c) 0.533
d) Cannot be determined from information given

3. The manager of a car dealership believes there is a relationship between the number of salespeople
on duty and the number of cars sold. Suppose the following sample is used to develop a simple
regression model to predict the number of cars sold from the number of salespeople. Solve for r²
and explain what r² means in this problem.

Week Cars Sold Salespeople


1 79 6
2 64 6
3 49 4
4 23 2
5 52 3

2.5 Let Us Sum Up


In this unit we have learned the basics of correlation and linear regression. Correlation tells us
whether two variables are related to each other and whether that relationship is positive or
negative. Regression gives an extended answer: it tells us how much change we can expect, or
predict, on the basis of that relationship, since with the regression line the change in the dependent
variable Y can be predicted for any value of the independent variable X. Broadly speaking, the
fitting of any chosen mathematical
function to given data is termed as regression analysis. The estimation of the parameters
of this model is accomplished by the least squares criterion which tries to minimize the
sum of squares of the errors for all the data points. After the model is fitted to data the
next logical question is to find out how good the quality of fit is. This question can best
be answered by conducting statistical tests and determining the standard errors of
estimate. An overall percentage variation by coefficient of determination can also be
computed. Finally, it can be concluded that the method of least squares used in linear
regression is applicable to different range of situations. Correlation and regression both
are important concepts to establishing relationships between variables from the given
data. The identified relationship and mathematical model may be used for the purpose of
prediction. Some of the models used in forecasting of demand based on regression-
analysis. One of the models of forecasting, named Time -series analysis is discussed in
next unit.

2.6 Answers for Check your Progress

Answers to check your progress 1

1. b

2. c

3. c
Answers to check your progress 2

1. c

2. c

3. a

Answers to check your progress 3

1. a

2. a

3. r2 = 0.826
2.7 Glossary

Independent variable: A variable that can be set either to a desirable value or takes
values that can be observed but not controlled.

Dependent/Response variable: The variable of interest or focus which is influenced by one or


more independent variable

Estimate: A value obtained from data for a certain parameter of the assumed model
or a forecast value obtained from the model.

Linear regression: Fitting of any chosen mathematical model, linear in unknown


parameters, to a given data.

Model: A general mathematical relationship relating a dependent (or response)


variable Y to independent variables X1 , X2 ……, Xn.

2.8 Assignment

1. Data on advertising expenditures (AE) and revenue (R) for the Four Seasons Restaurant is
given below. Figures are in 1000s.
AE 1 2 4 6 10 14 20
R 19 32 44 40 52 53 54
Answer Following:

a) Develop an estimated regression equation of revenue on advertising expenditure.

b)What is the estimated revenue when the advertising expenditure is 7?

c)Suppose SSR = 691 and SST = 1002. Find the value of R2 and interpret the same in the
context of the problem

2. Use the following data to determine the equation of the least squares regression line.
X 12 21 28 8 20
Y 17 15 22 19 24

3. What is the measure of correlation between the interest rate of federal funds and the
commodities futures index? Use the following data:

Days Interest Rate Future Index


1 7.43 223
2 7.48 221
3 8.00 222
4 7.75 226
5 7.58 225
6 7.64 223
7 7.69 224
8 8.01 221
9 8.23 227
10 8.45 235
11 8.52 241
12 8.56 238

4.Find the equation of the regression line for the following data and compute the residuals.

X 15 8 21 15 6 8 3
Y 45 38 55 46 24 33 49

2.9 Activities

A student is required to collect the stock price and stock return of last 15 days of any particular
stock from “money control”. Now, identify independent and dependent variable, find
pearsoncorrelation coefficient and regression line and comment on the outcome.

2.10 Case Study

According to the Capital Asset Pricing Model (CAPM), the risk associated with a capital asset is
proportional to the slope β1 (or simply β, the regression coefficient of Y on X) obtained by
regressing the asset's past returns on the corresponding returns of the average portfolio, called the
market portfolio. (The return of the market portfolio represents the return earned by the average
investor; it is a weighted average of the returns from all the assets in the market.) The larger the β
of an asset, the larger is the risk associated with that asset. A β of 1.00 represents average risk.
The returns from an IT firm's stock and the corresponding returns for the market portfolio for the
past 10 years are given below:

Market Return (X) 16 12 11 17 14 13 18 15 08 10


Stock’s Return (Y) 21 17 14 22 16 15 24 18 05 08

Answer the following questions:

1. What are the independent and dependent variables?


2. Carry out the regression and find the β for the stock. What is the regression equation?
3. Does the value of the slope indicate that the stock has above average risk? (in the
range of 1± 0.1, interpret the risk.)
4. If the market portfolio return for the current year is 25%, what is the stocks return?
5. Calculate standard error of estimate
6. Calculate the Pearson correlation co-efficient and coefficient of the determination and
state its interpretation.
7. Carry out residual analysis for each value

2.11 Further Reading

1. Business Statistics: For Contemporary Decision Making, by Ken Black, Wiley


Publication
2. Quantitative Techniques in Management, by N.D. Vora, McGraw hills
3. Operations Research theory and Applications, by J.K. Sharma, Macmillan
4. Operations Research, By Hamdy A Taha, Pearson Education
5. Quantitative Analysis for Management, by Barry Render, Ralph M. Stair, Michael E.
Hanna and T. N. Badri, Pearson Publication
6. Statistics for management, Levin and Rubin, Pearson Education
7. Business Statistics, David M. levine et al, Pearson Education
8. Use of software like QM for Windows, Excel Solver
Unit No. 3 Forecasting
______________________________________________
Unit Structure
3.0 Learning Objectives

3.1 Introduction

3.2 General Steps of forecasting techniques

3.3 Types of Forecasts Models


Check your progress 1

3.4 Time-Series Analysis


3.4.1 Components of Time-Series Analysis
3.4.2 Moving Average
3.4.3 Exponential Smoothing
3.4.4 Measures of Forecast Accuracy
Check your progress 2

3.5 Least Square Regression Analysis


Check your progress 3

3.6 Application Areas of Forecasting

3.7 Let Us Sum Up

3.8 Answers for Check your Progress

3.9 Glossary

3.10 Assignment

3.11 Activities

3.12 Case Study

3.13 Further Reading

3.0 Learning Objectives
After learning this unit, you will be able to,

• Know when to use various forecasting methods.


• Understand different types of forecast models
• Understand time series analysis - moving averages, exponential smoothing, least square
regression trend analysis for demand forecasting.
• Calculate different measures of forecast accuracy.

3.1 Introduction

Forecasting is a technique that we use in our day-to-day routine. Every day, forecasting is used in
decision making as a science and an art of predicting the future and then planning accordingly. It
is a process which helps business people reach conclusions about buying, selling, producing,
hiring, planning, manufacturing, inventory management and many other actions. As examples,
consider the following:

• Market watchers predict high and low prices and returns on stocks over the short, medium and long term.
• Weather forecasters predict rain, temperature, etc. for a particular city.
• Rising demand of laptops
• Predicting the future for paper industry
• Life insurance outlooks for number of claims for the next year.
• Trends or change in demand for clothing or apparels over the period of time
• Change in habit of eating over the period of time

How are these and other conclusions reached? What forecasting techniques are used? Are the
forecasts accurate? Here we will discuss several forecasting techniques, how to measure the error
of a forecast, and some of the problems that can occur in forecasting. Managers are always trying
to reduce uncertainty and make better estimates of what will happen in the future. This is the
main purpose of forecasting. This unit will focus on quantitative models in which data occur over
time, that is, time-series data. Time-series data are data gathered on a given characteristic over a
period of time at regular intervals. Time-series forecasting techniques attempt to account for
changes over time by examining patterns, cycles, or trends, or by using information about previous
time periods to predict the outcome for a future time period. Time-series methods include moving
averages, exponential smoothing, and least squares regression trend analysis.

3.2 General Steps of forecasting techniques


These steps provide a systematic way of initiating, designing, and implementing a forecasting
system. When the system is used regularly over a period of time, data can be collected routinely
and the calculations performed automatically. There is seldom a single best forecasting system;
different organizations may use different techniques.
• Determine the use of the forecast—what objective are we trying to obtain?
• Select the items or quantities that are to be forecasted.
• Determine the time horizon of the forecast.
• Select the forecasting model or models.
• Gather the data needed to make the forecast.
• Validate the forecasting model.
• Make the forecast.
• Implement the results.

3.3 Types of Forecasts Models

Forecasts models can be divided into three parts.


1. Qualitative models: incorporate judgmental or subjective factors. These are useful when
subjective factors are thought to be important or when accurate quantitative data are difficult to
obtain. Common qualitative techniques are:

Delphi Method: This is an iterative group process where (possibly geographically dispersed)
respondents provide input to decision makers.

Sales Force Composite: Individual salespeople estimate the sales in their region, and the data are
compiled at a district or national level.

Consumer Market Survey: Input is solicited from customers or potential customers regarding
their purchasing plans.

2. Time-series models: attempt to predict the future based on the past. Common time-series
models are moving averages, exponential smoothing, and trend projections.

3. Causal models: use variables or factors that might influence the quantity being forecasted. The
objective is to build a model with the best statistical relationship between the variable being
forecast and the independent variables. Regression analysis is the most common technique used
in causal modeling.
Check your progress 1
1. To apply the causal model approach, which of the following concepts can be used?
a) Regression Analysis
b) Decision Theory
c) Moving Average
d) Exponential Smoothing

2. Delphi approach is useful for _________________ analysis based on__________

3. Forecasting is useful for predicting_____________ for the sales in a company

3.4 Time-Series Analysis

A time series is a sequence of evenly spaced events.Time-series forecasts predict the future
based solely on the past values of the variable, and other variables are ignored.

3.4.1 Components of Time-Series Analysis: A time series typically has four components:

Trend (T): the gradual upward or downward movement of the data over time. A trend generally
spans a longer period, usually more than five years, e.g., the changing preference for mobile
phones or the demand for new homes.

Seasonal change (S): a pattern of demand fluctuations above or below the trend line that repeats
at regular intervals, generally within a year (month by month or season by season), e.g., flu cases
rising every year during the monsoon season.

Cycles (C): patterns in annual data that occur every several years, e.g., elections held every five
years to choose a new prime minister, or the census conducted by the government every ten years.

Random/irregular variations (R): variations in the data caused by chance or unusual situations
that follow no discernible pattern. There is no fixed time period; the data can change rapidly or
slowly at any point in time.

3.4.2 Moving Average

Moving averages can be used when demand is relatively steady over time. The next forecast is
the average of the most recent n data values from the time series. This method tends to smooth
out short-term irregularities in the data series.

Moving Average Forecast = (Sum of demand in previous n periods) / n

Mathematically,

Ft+1 = (Yt + Yt-1 + … + Yt-n+1) / n

Where,
Ft+1 = forecast for time period t + 1
Yt = actual value in time period t
n = number of periods to average

Example I

The demand for a product in each of the last five months is shown below.

Month 1 2 3 4 5
Demand ('00s) 13 17 19 23 24

Use a two-month moving average to generate a forecast for demand in month 6.

Solution of Example I
The two-month moving average for months two to five is given by:

m2 = (13 + 17)/2 = 15.0


m3 = (17 + 19)/2 = 18.0
m4 = (19 + 23)/2 = 21.0
m5 = (23 + 24)/2 = 23.5

The forecast for month six is just the moving average for the month before that, i.e. the moving
average for month 5 = m5 = 23.5.
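A small Python sketch of the n-period moving average, reproducing Example I (the function and variable names are our own):

# Two-month moving-average forecasts for the demand series of Example I.
def moving_average_forecasts(demand, n):
    # The forecast for period t+1 is the mean of the previous n observations.
    return [sum(demand[t - n:t]) / n for t in range(n, len(demand) + 1)]

demand = [13, 17, 19, 23, 24]                 # monthly demand in hundreds
print(moving_average_forecasts(demand, 2))    # [15.0, 18.0, 21.0, 23.5] -> forecasts for months 3 to 6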

3.4.3 Exponential Smoothing

Exponential smoothing is a type of moving average that is easy to use and requires little record
keeping of past data. The new estimate is the old estimate plus some fraction of the error in the last
period. The general approach is to develop trial forecasts with different values of α (the smoothing
constant) and select the value of α with the lowest mean absolute deviation (MAD), which will be
discussed in the next section.

New forecast = Last period's forecast + α * (Last period's actual demand – Last period's forecast)

where α is a weight (or smoothing constant) with 0 ≤ α ≤ 1.

Mathematically,
Ft+1 = Ft + α * (Yt – Ft)

where:
Ft+1 = new forecast (for time period t + 1)
Ft = previous forecast (for time period t)
α = smoothing constant (0 ≤ α ≤ 1)
Yt = previous period's actual demand

Example II

In January, February's demand for a certain car model was predicted to be 150. Actual February
demand was 166 autos. Using a smoothing constant of α = 0.20, what is the forecast for March?

Solution of Example II

New forecast (for March demand) = 150 + 0.2(166 – 150)= 153.2 or 153 autos

If actual demand in March was 146 autos, the April forecast would be:

New forecast (for April demand) = 153.2 + 0.2(146 – 153.2)= 151.76 or 152 autos
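A minimal Python sketch of this updating rule, reproducing Example II (names are illustrative):

# Simple exponential smoothing: F(t+1) = F(t) + alpha * (Y(t) - F(t)).
def exponential_smoothing(previous_forecast, actual, alpha):
    return previous_forecast + alpha * (actual - previous_forecast)

march = exponential_smoothing(previous_forecast=150, actual=166, alpha=0.2)
april = exponential_smoothing(previous_forecast=march, actual=146, alpha=0.2)
print(round(march, 2), round(april, 2))   # 153.2 151.76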

3.4.4 Measures of Forecast Accuracy

Forecast accuracy is assessed by comparing forecasted values with actual values to see how well
the model works. Several measures are available, as shown below:

Forecast error = Actual value – Forecast value

1. Mean Absolute Deviation: MAD = Σ |Forecast error| / n

2. Mean Squared Error: MSE = Σ (Forecast error)² / n
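The two accuracy measures can be written as small Python helper functions (a sketch; the sample values in the comments use the first few demand/forecast pairs from Example III below):

# MAD uses the absolute errors; MSE uses the squared errors.
def mad(actuals, forecasts):
    errors = [a - f for a, f in zip(actuals, forecasts)]
    return sum(abs(e) for e in errors) / len(errors)

def mse(actuals, forecasts):
    errors = [a - f for a, f in zip(actuals, forecasts)]
    return sum(e ** 2 for e in errors) / len(errors)

print(mad([33, 40, 41], [26.0, 31.0, 36.5]))   # about 6.83
print(mse([33, 40, 41], [26.0, 31.0, 36.5]))   # about 50.08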

Example III

The table below shows the demand for a new aftershave in a shop for each of the last 7 months.

Month 1 2 3 4 5 6 7
Demand 23 29 33 40 41 43 49

a) Calculate a two-month moving average for months two to seven. What would be your forecast
for the demand in month eight?

b) Apply exponential smoothing with a smoothing constant of 0.1 to derive a forecast for the
demand in month eight.

c) Which of the two forecasts for month eight do you prefer and why?
Solution of Example III

a) The two-month moving average for months two to seven is given by:

m2 = (23 + 29)/2 = 26.0


m3 = (29 + 33)/2 = 31.0
m4 = (33 + 40)/2 = 36.5
m5 = (40 + 41)/2 = 40.5
m6 = (41 + 43)/2 = 42.0
m7 = (43 + 49)/2 = 46.0

The forecast for month eight is just the moving average for the month before that i.e. the moving
average for month 7 = m7 = 46.

b) Applying exponential smoothing with a smoothing constant of 0.1 we get:

M1 = Y1 = 23
M2 = 0.1Y2 + 0.9M1 = 0.1(29) + 0.9(23) = 23.60
M3 = 0.1Y3 + 0.9M2 = 0.1(33) + 0.9(23.60) = 24.54
M4 = 0.1Y4 + 0.9M3 = 0.1(40) + 0.9(24.54) = 26.09
M5 = 0.1Y5 + 0.9M4 = 0.1(41) + 0.9(26.09) = 27.58
M6 = 0.1Y6 + 0.9M5 = 0.1(43) + 0.9(27.58) = 29.12
M7 = 0.1Y7 + 0.9M6 = 0.1(49) + 0.9(29.12) = 31.11

As before the forecast for month eight is just the average for month 7 = M7 = 31.11 = 31 (as we
cannot have fractional demand).

c) To compare the two forecasts we calculate the mean squared deviation (MSD). If we do this we
find that for the moving average

• MSD = [(26.0 - 33)² + ... + (42.0 - 49)²]/5 = 41.1

and for the exponentially smoothed average with a smoothing constant of 0.1

• MSD = [(23 - 29)² + ... + (29.12 - 49)²]/6 = 203.15

Overall, then, we see that the two-month moving average appears to give the better one-month-ahead
forecasts, as it has the lower MSD. Hence, we prefer the forecast of 46 produced by the two-month
moving average. In the same way, MSE or MAD can be used to compare forecasting methods and
arrive at a final decision.
Check your progress 2
1. Increase in the number of patients in the hospital due to heat stroke is:
(a) Secular trend (b) Irregular variation (c) Seasonal variation (d) Cyclical variation

2. An orderly set of data arranged in accordance with their time of occurrence is called:
(a) Arithmetic series (b) Harmonic series (c) Geometric series (d) Time series

3. A time series consists of:


(a) Short-term variations (b) Long-term variations (c) Irregular variations (d) All of the above

4. Wheat crops badly damaged on account of rains is:


(a) Cyclical movement (b) Random movement (c) Secular trend (d) Seasonal movement

5. Damages due to floods, droughts, strikes fires and political disturbances are:
(a) Trend (b) Seasonal (c) Cyclical (d) Irregular

3.5 Least Square Regression Analysis


The concept of simple linear regression analysis has already been discussed in Unit 2. Here, the
only difference is the independent variable: it is now a time period t, given as a month, quarter,
year, etc.

The trend equation here is ŷ = a + b * t

where

b (the slope) = (nΣty − ΣtΣy) / (nΣt² − (Σt)²)  and  a (the intercept) = (1/n)(Σy − bΣt)

Example IV

The sales of a company (in thousand rupees) for each year are shown in the table below.

x (year) 2005 2006 2007 2008 2009

y (sales) 12 19 29 37 45

a) Find the least squares regression (trend) line ŷ = a + b t.


b) Use the least squares regression line as a model to estimate the sales of the company in 2012.
Solution of Example IV
a) We first change the variable x into t such that t = x - 2005 and therefore t represents the
number of years after 2005. Using t instead of x makes the numbers smaller and therefore
manageable. The table of values becomes.

t (years after 2005) 0 1 2 3 4

y (sales) 12 19 29 37 45

Calculate a and b using the least squares formulas.

t y t*y t2

0 12 0 0

1 19 19 1

2 29 58 4

3 37 111 9

4 45 180 16

Σt = 10   Σy = 142   Σt*y = 368   Σt² = 30

We now calculate b and a using the least squares formulas.

b = (nΣty − ΣtΣy) / (nΣt² − (Σt)²) = (5*368 − 10*142) / (5*30 − 10²) = 8.4

a = (1/n)(Σy − bΣt) = (1/5)(142 − 8.4*10) = 11.6

So the trend equation is ŷ = 11.6 + 8.4 t

b) In 2012, t = 2012 − 2005 = 7.

The estimated sales in 2012 are: ŷ = 11.6 + 8.4 * 7 = 70.4 thousand rupees.
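A short Python sketch reproducing this trend-line calculation (names are illustrative):

# Least squares trend line for Example IV, with t = years after 2005.
t = [0, 1, 2, 3, 4]
y = [12, 19, 29, 37, 45]
n = len(t)

b = (n * sum(ti * yi for ti, yi in zip(t, y)) - sum(t) * sum(y)) / (n * sum(ti ** 2 for ti in t) - sum(t) ** 2)
a = (sum(y) - b * sum(t)) / n

print(a, b)        # intercept 11.6 and slope 8.4, i.e. y_hat = 11.6 + 8.4 t
print(a + b * 7)   # forecast for 2012 (t = 7): about 70.4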

Check your progress 3
1. Given an actual demand of 103, a previous forecast of 99, and alpha of 0.4, the next-period
exponential smoothing forecast is:
a)96.9
b)100.6
c)101.7
d)102

2. Given forecast errors of -1, 4, 8, and -3, what is the MAD?


a)2
b)3
c)7
d)4

3. If a = 11.8 and b = 19 for yearly data running from 2008 to 2015, the regression equation is
_______________ and the forecast for 2018 is _____________

3.6 Application Areas of Forecasting


Forecasting can be used in supply chain management to ensure that the right product is at the
right place at the right time. Accurate forecasting helps retailers reduce excess inventory and thus
increase profit margins, and it also helps them meet consumer demand.
Other prominent application areas include:
• Economic forecasting
• Earthquake prediction
• Finance against risk of default via credit ratings and credit scores
• Land use forecasting
• Player and team performance in sports
• Political forecasting
• Product forecasting
• Sales forecasting
• Technology forecasting
• Telecommunications forecasting
• Transport planning and Transportation forecasting
• Weather forecasting and Flood forecasting

3.7 Let Us Sum Up

This unit has mainly focused on the importance of forecasting in short-term, medium-term and
long-term planning decisions. For long-term planning decisions, qualitative techniques are used,
such as technological forecasting and expert opinions gathered through the Delphi method or
opinion polls using personal interviews or questionnaires. For medium-term and short-term
decisions, apart from subjective and intuitive methods, there is a wide variety of statistical
techniques that can be employed, such as moving averages and exponential smoothing, which are
based on past data. Any suitable mathematical function can be fitted to the demand history by using
least squares regression. Regression is also used in the estimation of parameters of causal or
econometric models.

3.8 Answers for Check your Progress

Answers to check your progress 1

1. a

2. Qualitative, Feelings/Intuition

3. Future Demand

Answers to check your progress 2

1. c

2. d

3. d

4. b

5. d

Answers to check your progress 3

1. b

2. d

3. Y = 11.8 + 19 *X, 201.8


3.9 Glossary

Forecasting: A systematic procedure to determine the future value of a variable of interest.

Moving Average: An average computed from the N most recent demand points (for an N-period
moving average), commonly used for short-term forecasting.

Prediction: A term to denote the estimate or guess of a future variable that may be arrived at by
subjective feelings or intuition.

Regression: Establishing, from a given demand history, a relation between the dependent variable
(such as demand) and one or more independent variables. These relations are important for
planning future demand.

Time Series: Any data on demand, sales or consumption taken at regular intervals of time is a
time series. Analysis of this time series to discover patterns of growth, demand, seasonal trends or
random fluctuations is known as time-series analysis.

Causal Models: Forecasting models wherein the demand, or variable of interest, is related to one
or more explanatory (causal) variables.

Delphi: A method of collecting information from experts, useful for long term forecasting. It is
iterative and maintains confidentiality to reduce subjective bias.

Exponential Smoothing: A short-term forecasting method based on weighted averages of past
data, in which the weights decrease exponentially as the data get older; the highest weight is given
to the most recent data.

3.10 Assignment

1. The table below shows the demand for a particular brand of razor in a shop for each of the last
nine months.

Month 1 2 3 4 5 6 7 8 9
Demand 10 12 13 17 15 19 20 21 20

a)Calculate a three-month moving average for months three to nine. What would be your
forecast for the demand in month ten?

b)Apply exponential smoothing with a smoothing constant of 0.3 to derive a forecast for the
demand in month ten.

c) Which of the two forecasts for month ten do you prefer and why?

2. The table below shows the demand for a particular brand of fax machine in a department store
in each of the last twelve months.
Month 1 2 3 4 5 6 7 8 9 10 11 12
Demand 12 15 19 23 27 30 32 33 37 41 49 58

a) Calculate the four-month moving average for months 4 to 12. What would be your forecast for
the demand in month 13?

b) Apply exponential smoothing with a smoothing constant of 0.2 to derive a forecast for the
demand in month 13.

c) Which of the two forecasts for month 13 do you prefer and why?

3. Find the regression trend line for the following data of equity fund investment (In lakhs of
rupees per year) from 2001 to 2018.

Year Investment Year Investment


2001 45 2010 80
2002 48 2011 85
2003 52 2012 88
2004 54 2013 99
2005 57 2014 105
2006 64 2015 115
2007 66 2016 120
2008 73 2017 125
2009 78 2018 128

3.11 Activities

1. You are required to collect the data on corona cases registered and recovered from March 20,
2020 to June 20, 2020. Analyze the trend between the two variables and forecast the number of new
cases for the month of July 2020.

2. Visit a manufacturing company which is established for at least 15 years. Select any product
of the company if they are manufacturing more than one product. Collect the data of price,
production, demand, sales year wise. Now identify the change in each variable data with respect
to years passed.

3.12 Case Study

Following are the average yields of long-term new corporate bonds over a several-month period,
published by the Market Finance Department of the Treasury.

Month Yield Month Yield Month Yield


1 10.08 10 8.59 19 7.35
2 10.05 11 7.99 20 7.04
3 9.24 12 8.12 21 6.88
4 9.23 13 7.91 22 6.88
5 9.69 14 7.73 23 7.17
6 9.55 15 7.39 24 7.12
7 9.37 16 7.48
8 8.55 17 7.52
9 8.36 18 7.48

a) Explore trends in these data by using regression trend analysis.

b) Use a 4-month moving average to forecast values for each of the ensuing months.

c) Use simple exponential smoothing to forecast values for each of the ensuing months. Let α = .3,
and then try other values of α. Which weight produces better forecasts?

d) Compute MAD for the forecasts obtained in parts (b) and (c) and compare the results.

3.13 Further Reading

1. Business Statistics: For Contemporary Decision Making, by Ken Black, Wiley


Publication
2. Quantitative Techniques in Management, by N.D. Vora, McGraw hills
3. Operations Research theory and Applications, by J.K. Sharma, Macmillan
4. Operations Research, By Hamdy A Taha, Pearson Education
5. Quantitative Analysis for Management, by Barry Render, Ralph M. Stair, Michael E.
Hanna and T. N. Badri, Pearson Publication
6. Statistics for management, Levin and Rubin, Pearson Education
7. Business Statistics, David M. levine et al, Pearson Education
8. Use of software like QM for Windows, Excel Solver
Block Summary

In this block, we learned various techniques about the vital aspect of any business that is decision
making and forecasting. The decisions taken by applying quantitative methods may be used to
achieve optimum profit or cost and perhaps it can help to forecast future also. In the first unit,
one stage and multi stage decision making techniques were explained. Decision making under
uncertainty and risk along with the decision tree approach with certainty have been discussed. In
the second unit, linear relationships between independent and dependent variables were
discussed with the help of concepts like correlation, coefficient of determination and regression
analysis. In the last unit, the forecasting techniques with various models were explained. Third
unit also covered the time series analysis and least square regression analysis.
Block Assignment

Short Answer Questions

1. Explain types of decision-making environments.


2. Differentiate between one stage and multi-stage decision making process.
3. Difference between correlation and coefficient of determination
4. Explain standard error of estimate
5. Explain types of forecast models

Long answer Questions

1. Explain all the criteria used for decision-making with uncertainty


2. Explain components of time-series analysis with examples
3. Your company is considering whether it should tender for two contracts (MS1 and MS2)
on offer from a civil construction department for the supply of certain components. The
company has three options: tender for MS1 only; or tender for MS2 only; or tender for
both MS1 and MS2.

If tenders are to be submitted the company will incur additional costs. These costs will have
to be entirely recouped from the contract price. The risk, of course, is that if a tender is
unsuccessful the company will have made a loss.

The cost of tendering for contract MS1 only is 50,000. The component supply cost if the
tender is successful would be 18,000.
The cost of tendering for contract MS2 only is 14,000. The component supply cost if the
tender is successful would be 12,000.
The cost of tendering for both contract MS1 and contract MS2 is 55,000. The component
supply cost if the tender is successful would be 24,000.

For each contract, possible tender prices have been determined. In addition, subjective
assessments have been made of the probability of getting the contract with a particular tender
price as shown below. Note here that the company can only submit one tender and cannot,
for example, submit two tenders (at different prices) for the same contract. Solve the
dilemma with decision tree approach.

Options Possible Tender Prices Probability of getting contract


MS1 150,000 0.50
MS2 80,000 0.80
MS1 and MS2 both 195,000 0.90

4. Forecast next year's sales based on changes in GDP.


Year Sales GDP
2015 100 1.00%
2016 250 1.90%
2017 275 2.40%
2018 200 2.60%
2019 300 2.90%

5. Calculate the pearson product moment correlation coefficient and regression line for the
following data:

X = Price 11 12 13 14 16 15 17

Y = Amount Demanded 40 39 43 44 38 36 46
Block Structure
_________________________________
Block 3 Linear Programming Problem and
Special problems
_________________________________
Block Introduction

Operations research has always been a vital part of industry. The agenda of research on operations
is the maximum utilization of available resources within given restrictions. As resources are
generally scarce, there is a need to learn techniques which can help in achieving maximum profit
along with minimum cost. Thus, in this block we will explore some of the most common and useful
techniques of linear programming problems for two or more variables. The first unit describes the
formulation of a given problem as a mathematical function and then its solution by graphical
analysis to arrive at decisions. Decisions always concern one of two objectives: maximization of
profit or minimization of cost. The second unit describes the simplex method, which is used when
two or more decision variables are concerned with utilizing available resources in the best possible
way to maximize profit. The third unit describes the development of a transportation schedule for
shipment from a source to a destination. In the fourth unit we will explore the assignment concept,
which is useful for understanding the allocation of jobs/projects to employees, workers, or machines
with a scientific approach.

Block Objectives
After learning this block, you will be able to:

• Formulate management problem as a linear programming problem in suitable cases


• Understand the characteristics of a linear programming problem
• Find solution of the problem by graphical analysis
• Understand different types of solutions
• Identify various applications of linear programming in business and industry.
• Discuss the principles of simplex method
• Learn the algorithm of simplex method
• Understand computational part of simplex method
• Understand the practicality of the concept with stated assumptions
• Understand the basic feasible solution of a transportation problem by various methods
• Obtain the minimum transportation cost schedule by using Modified Distribution Method
• Discuss the special cases of transportation
• Discuss the steps of learned method when problem is related to minimization
• Understand the concept and assumptions in comprehensive manner
• Learn algorithm of Hungarian assignment method
• Use the algorithm for solving an assignment problem
• Learn special cases of assignment

Block Structure

Unit 1: Linear Programming formulation and Graphical Method

Unit 2: LPP-Simplex Method

Unit 3: Transportation

Unit 4: Assignment
_________________________________
Unit No. 1 Linear Programming
formulation and Graphical Method
_________________________________
Unit Structure
1.0 Learning Objectives

1.1 Introduction
1.1.1 Characteristics of LPP

1.2 Formulation of Linear Programming Problem (LPP)


1.2.1 Steps of Linear Programming Formulation
1.2.2 Examples of LPP Formulation
Check your progress 1

1.3 Graphical Analysis


1.3.1 Steps of Graphical Analysis
1.3.2 Example of Graphical Analysis
1.3.3 Slack and Surplus
1.3.4 Convex and Non-Convex Set
Check your progress 2

1.4 Types of constraints

1.5 Special Cases


1.5.1 Multiple Optimal Solutions
1.5.2 Unbounded Solution
1.5.3 Infeasibility
Check your progress 3

1.6 Application Areas of Linear Programming in Business

1.7 Let Us Sum Up

1.8 Answers for Check your Progress

1.9 Glossary

1.10 Assignment

1.11 Activities
1.12 Case Study

1.13 Further Reading


1.0 Learning Objectives

• Formulate a management problem as a linear programming problem in suitable cases
• Understand the characteristics of a linear programming problem
• Find the solution of the problem by graphical analysis
• Understand different types of solutions
• Identify various applications of linear programming in business and industry.

1.1 Introduction

Linear programming is a technique that can be applied to a variety of management problems
such as production, advertising, transportation, supply and distribution, and investment analysis.
Over the years, linear programming has been found useful not only in the field of management
but also in government, hospitals, libraries and education. The problem has a clearly defined
objective; the most common objectives are maximization of profit/contribution or minimization
of cost. Linear programming indicates the right combination of the various decision variables
which can best be used to achieve the objective, while considering the practical limitations within
which the problem must be solved.

A linear programming problem is a widely used mathematical modeling technique designed to help
managers in planning and decision making relative to resource allocation. Resources include
machinery, labor, money, time, warehouse space, raw materials, etc. It is a powerful technique for
supporting managerial decision making for certain kinds of problems. The basic approach is to
formulate a mathematical model, called a linear programming model, to represent the problem and
then to analyze this model. Any linear programming model includes three basic parts: 1. decision
variables that represent the decision to be made; 2. constraints that represent the restrictions on the
feasible values of these decision variables; and 3. an objective function that expresses the overall
measure of performance for the problem.

Although only the graphical method for two decision variables is presented in this unit, easy and
efficient computational procedures, known as algorithms, are available to solve linear programming
problems. The development of various software packages has made it possible to solve problems
with a large number of decision variables and constraints.

1.1.1 Characteristics of LPP


• One objective function: maximization or minimization
• One or more constraints that limit the degree to which the objective can be attained
• Mathematical relationships in the objective and constraints are always linear
• Linear programming models are deterministic in nature
• Finite choices: decision variables can take only non-negative values
1.2 Formulation of Linear Programming Problem (LPP)

The formulation of a linear programming problem can be explained through a product mix
problem. Typically, this occurs in a manufacturing industry where there is a requirement to
manufacture a variety of products with a given set of resources. Each of the products has a certain
margin of profit per unit and cost per unit. These products draw on a common pool of resources,
according to availability. The linear programming technique identifies the combination of the
products which will either maximize the profit or minimize the cost without violating the
restrictions related to resources. Thus, the company would like to determine how many units of
each product it should produce so as to maximize overall profit or minimize overall production
cost. Basically, it involves two types of LPPs: maximization (profit) and minimization (cost).

1.2.1 Steps of Linear Programming Formulation


1. Identify objective: Maximization or Minimization.
2. Identify number of constraints and decision variables. (Note that number of constraints are
always according to resources and number of decision variables are according to products).
3. Use the decision variables to write objective function and all the constraints in form of
mathematical expressions.
4. Write non-negativity condition

1.2.2 Examples of LPP Formulation

Example I (Maximization)

The Jay Ambe Company produces two types of products: tables and chairs. The processes are similar
in that both require a certain number of hours of carpentry work and of time in the painting
department. Each table takes 5 hours of carpentry and 2 hours of painting. Each chair requires 4
hours of carpentry and 2 hours of painting. There are a total of 250 hours of carpentry time and 110
hours of painting time available per week. Each table yields a profit of Rs. 65 and each chair a profit
of Rs. 60. Formulate this as a linear programming problem.

Solution of Example I

A firm wants to determine the best combination of tables and chairs to produce to reach the
maximum profit.
Hours required to produce one unit
Department Tables (x1) Chairs (x2) Available Hours/Week
Carpentry 5 4 250
Painting 2 2 110
Profit Per Unit 65 60
The objective is to:Maximize profit
The constraints according to two resources are:
• The hours of carpentry time used cannot exceed 250 hours per week.
• The hours of painting time used cannot exceed 110 hours per week.

The decision variables according to two types of products are:


• x1 = number of tables to be produced per week.
• x2 = number of chairs to be produced per week.

Now, write the LP objective function in terms of x1 and x2:

Maximize profit Z = 65x1 + 60x2

Now, Develop mathematical relationships for the two constraints:


For carpentry, the total time used is:
(5 hours per table) * (number of tables produced) + (4 hours per chair) * (number of chairs
produced),

and we know that we can use at most the available carpentry time, so
5x1 + 4x2 ≤ 250 (hours of carpentry time)

Similarly, for painting, the constraint is 2x1 + 2x2 ≤ 110. Both of these constraints restrict
production capacity and affect total profit.

The values for x1 and x2 must be nonnegative.


• x1 ≥ 0 (number of tables produced is greater than or equal to 0)
• x2 ≥ 0 (number of chairs produced is greater than or equal to 0)

The complete problem explained mathematically:


Maximize Profit Z = 65x1 + 60x2
subject to
5x1 + 4x2 ≤ 250 (carpentry constraint)
2x1 + 2x2 ≤ 110 (painting constraint)
x1, x2 ≥ 0 (nonnegativity constraint)
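For readers who wish to verify the formulation with software, here is a minimal sketch using SciPy's linprog routine (illustrative only; it assumes the scipy package is installed, which is not part of the software listed for this course). Since linprog minimizes, the profit coefficients are negated:

# Solving the Jay Ambe product-mix LPP with scipy.optimize.linprog.
from scipy.optimize import linprog

c = [-65, -60]                 # negated profit per table (x1) and per chair (x2)
A_ub = [[5, 4],                # carpentry hours used
        [2, 2]]                # painting hours used
b_ub = [250, 110]              # hours available per week

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(result.x, -result.fun)   # about x1 = 30 tables, x2 = 25 chairs, profit 3450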

Example II (Minimization)

A farm is engaged in breeding pigs. The pigs are fed on various products grown on the farm. With
a view to ensuring certain minimum nutrition for the growth of the pigs, two types of feeds, A and
B, are purchased from the market. Feed A costs Rs. 20 per unit and feed B costs Rs. 40 per unit.
The contents of these feeds per unit, in nutrient constituents, are as given in the following table.
Formulate as an LPP.
Nutrient   Content per unit of feed A   Content per unit of feed B   Minimum requirement of the nutrient for a pig
M1   12   6   108
M2   3   9   81
M3   15   10   150

Solution of Example II

The objective is to:Minimize Cost

The constraints according to three nutrient requirements are:


• The nutrient M1 requirement should be at least 108 units
• The nutrient M2 requirement should be at least 81 units
• The nutrient M3 requirement should be at least 150 units

The decision variables according to two types of feeds are:


• A = number of units purchased of feed type A
• B = number of units purchased of feed type B

Now, write the LP objective function in terms of A and B:

MinimizeCost Z = 20A + 40B

Now, Develop mathematical relationships for the three constraints:


For nutrient M1, the minimum requirement is 108 units:
(12 units) * (number of units purchased of feed type A) + (6 units) * (number of units purchased
of feed type B).

Since the feed must supply at least this amount of the nutrient, the constraint is
12A + 6B ≥ 108 (minimum nutrient M1 requirement)

Similarly, for nutrients M2 and M3, the constraints are 3A + 9B ≥ 81 and 15A + 10B ≥ 150,
respectively.

The values for A and B must be nonnegative.


• A ≥ 0 (number of units purchased is greater than or equal to 0)
• B ≥ 0 (number of units purchased is greater than or equal to 0)
The complete problem explained mathematically:

Minimize Cost Z = 20A + 40B

subject to
12A + 6B ≥ 108 (minimum nutrient M1 requirement)
3A + 9B ≥ 81 (minimum nutrient M2 requirement)
15A + 10B ≥ 150 (minimum nutrient M3 requirement)
A, B ≥ 0 (nonnegativity constraint)
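The minimization problem can be checked in the same way; a sketch assuming scipy is installed (linprog accepts only "less than or equal to" rows, so each "greater than or equal to" constraint is multiplied by -1):

# Solving the feed-mix LPP with scipy.optimize.linprog.
from scipy.optimize import linprog

c = [20, 40]                        # cost per unit of feed A and feed B
A_ub = [[-12, -6],                  # -(12A + 6B) <= -108  <=>  12A + 6B >= 108
        [-3, -9],                   # M2 requirement
        [-15, -10]]                 # M3 requirement
b_ub = [-108, -81, -150]

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(result.x, result.fun)         # about A = 5.4, B = 7.2, minimum cost 396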

From the above examples, we can see that in maximization problems the constraints usually carry
a "less than or equal to" sign, while in minimization problems they usually carry a "greater than or
equal to" sign. Sometimes, however, a problem contains a combination of both types of constraints,
depending on the availability of resources.

Check your progress 1


1. Maheshbabu has two iron mines. The production capacities of the mines are different. The
iron ore can be classified into good, mediocre and bad varieties after certain process. The owner
has decided to supply 14 or more tons of good iron, 9 or more tons of mediocre iron and 22 or
more tons of bad iron per week. The daily expense of the first mine is Rs. 2100 and that of the second mine is
Rs.1600. The daily production of iron mine type I for good, mediocre and bad varieties is 5, 3
and 4 respectively. And the daily production of iron mine type II for good, mediocre and bad
varieties is 2, 2 and 10 respectively. Formulate LPP.

2. State true or false


A linear programming model consists of decision variables, constraints, but no objective
function.

3. Constraints in an LP model represents


A) Limitations
B) Requirements
C) balancing limitations and requirements
D) all of above

1.3 Graphical Analysis
The easiest way to solve a small LPP is graphically. The graphical method only works when there
are just two decision variables. When there are more than two variables, a more complex approach
is needed, as it is not possible to plot the solution on a two-dimensional graph. The graphical
method provides valuable insight into how other approaches work.

1.3.1 Steps of Graphical Analysis

1. Formulate LPP that should have only two decision variables


2. Draw straight lines for every equation that includes all the constraints
3. Mark the feasible region
4. Find out co-ordinates of the vertices of feasible region
5. Calculate value of objective function at different vertices
6. The co-ordinates of the vertex at which the optimal value of the objective function (the
maximum profit that can be achieved or the minimum cost that can be incurred) is obtained give
the optimal solution.

1.3.2 Example of Graphical Analysis

Example III

Maximize Profit Z = 70x1 + 50x2


subject to
4x1 + 3x2≤240 (Drilling Constraint)
2x1 + 1x2≤100 (Milling Constraint)
X1, x2≥0 (Nonnegativity condition)

Solution of Example III

Here, x1 = number of units of product A


x2 = number of units of product B

Step 1

The first step in solving the problem is to identify a set or region of feasible solutions. To do this
we plot each constraint equation on a graph.

We start by graphing the equality portion of the Drilling constraint equation:


4x1 + 3x2 = 240

We solve for the axis intercepts and draw the line.

When company produces no unit of product A, the constraint is:


4(0) + 3x2 = 240
3x2 = 240
x2 = 80

Similarly, for no unit of product B, the constraint is:


4x1 + 3(0) = 240
4x1 = 240
X1 = 60

This line joins the intercepts (x1 = 0, x2 = 80) and (x1 = 60, x2 = 0). (Graph omitted.)

Step 2

Now, graphing the equality portion of the Milling constraint equation:


2x1 + 1x2 = 100

We solve for the axis intercepts and draw the line.

When company produces no unit of product A, the constraint is:


2(0) + x2 = 100
x2 = 100

Similarly, for no unit of product B, the constraint is:


2x1 + 1(0) = 100
1x1 = 100
X1 = 50

This line joins the intercepts (x1 = 0, x2 = 100) and (x1 = 50, x2 = 0). Plotting the drilling and
milling constraint lines together marks out the feasible region, whose corner points are labeled 1 to 4.
(Graph omitted; it showed the drilling constraint, the milling constraint and the shaded feasible region.)

Step 3

In the above graph there is a feasible region, which means "the region which satisfies all the
constraints". For the drilling and milling constraints the maximum availability is 240 and 100
respectively, so we identify the common region which satisfies both constraints. (To identify the
common feasible region, consider the sign of each constraint: "less than", "greater than" or "equal to".)

Once the feasible region has been graphed, we need to find the optimal solution from the many
possible solutions. This approach is known as the corner point method. It involves looking at the
profit at every corner point of the feasible region. The mathematical theory behind LP is that the
optimal solution must lie at one of the corner points, or extreme points, of the feasible region. For
this example, the feasible region is a four-sided polygon with four corner points labeled 1, 2, 3,
and 4 on the graph.

To find the coordinates of Point 3 accurately, we have to solve for the intersection of the two
constraint lines. Using the simultaneous equations method, we multiply the milling equation by
–2 and add it to the drilling equation:

4x1 + 3x2= 240 (Drilling line)


– 4x1 – 2x2 = –200 (Milling line)
X2 = 40
Substituting 40 for x2 in either of the original equations allows us to determine the value of x1.

4x1 + (3)(40) = 240 (Drilling line)


4x1 + 120 = 240
X1 = 30

Find the final solution by putting all x1 and x2 values in objective function.

Points X1 X2 Maximize Z = 70x1 + 50x2


1 0 0 0
2 0 80 4000
3 30 40 4100
4 50 0 3500

Because Point 3 returns the highest profit, this is the optimal solution.
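The corner point method itself is easy to script; a minimal Python sketch for Example III, using the corner points found graphically above:

# Evaluate Z = 70x1 + 50x2 at every corner point of the feasible region.
corner_points = [(0, 0), (0, 80), (30, 40), (50, 0)]

profits = {point: 70 * point[0] + 50 * point[1] for point in corner_points}
best = max(profits, key=profits.get)
print(profits)   # {(0, 0): 0, (0, 80): 4000, (30, 40): 4100, (50, 0): 3500}
print(best)      # (30, 40), the optimal solution with profit 4100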

1.3.3 Slack and Surplus

Slack is the amount of a resource that is not used. For a less-than-or-equal constraint:

Slack = Amount of resource available – amount of resource used.

In Example III, the optimal solution is x1 = 30 and x2 = 40. Substituting these values into the drilling
constraint 4x1 + 3x2 ≤ 240 gives 4(30) + 3(40) = 240; here LHS = RHS, so there is no slack.

Substituting the same values into the milling constraint 2x1 + 1x2 ≤ 100 gives 2(30) + 1(40) = 100;
again LHS = RHS, so there is no slack.

In both constraints the resources are fully utilized, so there is no slack.

Surplus is used with a greater-than-or-equal constraint to indicate the amount by which the
right-hand side of the constraint is exceeded.

Surplus = Actual amount – minimum amount.

For example, if the actual amount is 240 but the minimum requirement is only 160, then you have a
surplus of 240 − 160 = 80.

1.3.4 Convex and Non-Convex Set

If, for any two points selected in the region, the line segment formed by joining them lies completely
within the region, then the region is a convex set; the feasible region of an LPP is always a convex set.

If two points can be selected in the region such that the line segment joining them does not lie
completely within the region, then the region is a non-convex set.
Check your progress 2
1. A feasible solution of LPP
A) Must satisfy all the constraints simultaneously
B) Need not satisfy all the constraints, only some of them
C) Must be a corner point of the feasible region
D) all of the above

2. The objective function for an LP model is 3x1 + 2x2. If x1 = 20 and x2 = 30, what is the value of
the objective function?
A) 0
B) 50
C) 60
D) 120

3. The graphical method can only be used when there are _____ decision variables

4. The __________ is that region which satisfies all constraints.

1.4 Types of constraints

1. Binding Constraints: If in the constraints LHS = RHS when optimal values of the
decision variables are substituted into the constraints then those constraints are binding
constraints
2. Non - Binding Constraints: If in the constraints LHS ≠ RHS when optimal values of the
decision variables are substituted into the constraints then those constraints are Non-
binding constraint
3. Redundant Constraints: When a constraint, when plotted, does not form part of the
boundary marking the feasible region of the problem, it is said to be Redundant
It does not affect the optimal solution to the problem

1.5 Special Cases

1.5.1 Multiple Optimal Solutions: several solutions give the same optimal value of profit or cost,
so the optimal solution is not unique and more than one optimal solution exists.
Example IV

Max Z = 4x1 + 3x2

Subject to
4x1+ 3x2 ≤ 24
x1 ≤ 4.5
x2 ≤ 6
x1 ≥ 0 , x2 ≥ 0

Solution of Example IV

The first constraint 4x1+ 3x2 ≤ 24, written in a form of


equation 4x1+ 3x2 = 24
Put x1 =0, then x2 = 8
Put x2 =0, then x1 = 6
The coordinates are (0, 8) and (6, 0)
The second constraint x1 ≤ 4.5, written in a form of
equation x1 = 4.5
The third constraint x2 ≤ 6, written in a form of
equation x2 = 6

The corner points of feasible region are A, B, C and D. So the coordinates for the corner points
are
A (0, 6)
B (1.5, 6) (Solve the two equations 4x1+ 3x2 = 24 and x2 = 6 to get the coordinates)
C (4.5, 2) (Solve the two equations 4x1+ 3x2 = 24 and x1 = 4.5 to get the coordinates)
D (4.5, 0)
We know that Max Z = 4x1 + 3x2
At A (0, 6)
Z = 4(0) + 3(6) = 18
At B (1.5, 6)
Z = 4(1.5) + 3(6) = 24

At C (4.5, 2)
Z = 4(4.5) + 3(2) = 24

At D (4.5, 0)
Z = 4(4.5) + 3(0) = 18

Max Z = 24, which is achieved at both corner points B and C. In fact, it is achieved not only at B and C but at every point on the line segment between B and C. Hence the given problem has multiple optimal solutions.
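Multiple optima can be detected numerically by evaluating the objective at every corner point and checking for ties, as the small Python sketch below does for Example IV (the corner points are taken from the working above).

```python
# Evaluating Z = 4x1 + 3x2 at the corner points of Example IV.
corners = {"A": (0, 6), "B": (1.5, 6), "C": (4.5, 2), "D": (4.5, 0)}
z = {p: 4 * x1 + 3 * x2 for p, (x1, x2) in corners.items()}
print(z)                                          # A: 18, B: 24, C: 24, D: 18
best = max(z.values())
print([p for p, v in z.items() if v == best])     # ['B', 'C'] -> multiple optima
```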

1.5.2 Unbounded Solution: If the value of the objective function can be increased (or decreased) indefinitely without violating any constraint, the LP problem is said to have an unbounded solution. This generally happens in a maximization problem in which all constraints are of the "greater than or equal to" type, so the feasible region is open on the upper side and there is no limit on how far the objective can be pushed.

Example V

Max Z = 3x1 + 5x2


Subject to
2x1+ x2 ≥ 7
x1+ x2 ≥ 6
x1+ 3x2 ≥ 9
x1 ≥ 0 , x2 ≥ 0

Solution of Example V

The first constraint 2x1+ x2 ≥ 7, written in a form of equation


2x1+ x2 = 7
Put x1 =0, then x2 = 7
Put x2 =0, then x1 = 3.5
The coordinates are (0, 7) and (3.5, 0)

The second constraint x1+ x2 ≥ 6, written in a form of equation


x1+ x2 = 6
Put x1 =0, then x2 = 6
Put x2 =0, then x1 = 6
The coordinates are (0, 6) and (6, 0)

The third constraint x1+ 3x2 ≥ 9, written in a form of equation


x1+ 3x2 = 9
Put x1 =0, then x2 = 3
Put x2 =0, then x1 = 9
The coordinates are (0, 3) and (9, 0)
The corner points of feasible region are A, B, C and D. So the coordinates for the corner points
are

A (0, 7)
B (1, 5) (Solve the two equations 2x1+ x2 = 7 and x1+ x2 = 6 to get the coordinates)
C (4.5, 1.5) (Solve the two equations x1+ x2 = 6 and x1+ 3x2 = 9 to get the coordinates)
D (9, 0)
We know that Max Z = 3x1 + 5x2
At A (0, 7)
Z = 3(0) + 5(7) = 35

At B (1, 5)
Z = 3(1) + 5(5) = 28

At C (4.5, 1.5)
Z = 3(4.5) + 5(1.5) = 21

At D (9, 0)
Z = 3(9) + 5(0) = 27

The values of the objective function at the corner points are 35, 28, 21 and 27. However, the feasible region is unbounded and contains infinitely many points at which the objective function exceeds these corner-point values; the maximum value of the objective function can be pushed towards ∞. Hence the given problem has an unbounded solution.
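A solver reports this situation explicitly. In the hedged sketch below, the ≥ constraints of Example V are rewritten as ≤ constraints by multiplying both sides by –1, and SciPy's linprog is expected to flag the problem as unbounded (status code 3).

```python
# Example V: SciPy is expected to report an unbounded problem (status code 3).
from scipy.optimize import linprog

c = [-3, -5]                           # maximize 3x1 + 5x2 -> minimize the negative
A_ub = [[-2, -1], [-1, -1], [-1, -3]]  # >= constraints rewritten as <= by negation
b_ub = [-7, -6, -9]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.status, res.message)         # expected status: 3 (unbounded)
```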

1.5.3 Infeasibility: A set of values of the decision variables that does not satisfy all the constraints and non-negativity conditions of an LP problem simultaneously is said to constitute an infeasible solution to that linear programming problem. In short, infeasibility arises when it is not possible to find a common region that satisfies all constraints simultaneously.

Example VI

Max Z = 3x1 + 2x2

Subject to
x1+ x2 ≤ 1
x1+ x2 ≥ 3
x1 ≥ 0 , x2 ≥ 0

Solution of Example VI

The first constraint x1+ x2 ≤ 1, written in a form of


equation x1+ x2 = 1
Put x1 =0, then x2 = 1
Put x2 =0, then x1 = 1
The coordinates are (0, 1) and (1, 0)
The second constraint x1+ x2 ≥ 3, written in a form of
equation x1+ x2 = 3
Put x1 =0, then x2 = 3
Put x2 =0, then x1 = 3
The coordinates are (0, 3) and (3, 0)

There is no common feasible region generated by the two constraints together, i.e. we cannot identify even a single point satisfying both constraints. Hence there is no feasible solution and therefore no optimal solution.
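Infeasibility is likewise reported directly by a solver. The sketch below feeds the two contradictory constraints of Example VI to SciPy's linprog, which is expected to return status code 2 (infeasible).

```python
# Example VI: SciPy is expected to report an infeasible problem (status code 2).
from scipy.optimize import linprog

c = [-3, -2]                 # maximize 3x1 + 2x2
A_ub = [[1, 1],              # x1 + x2 <= 1
        [-1, -1]]            # x1 + x2 >= 3 rewritten as -x1 - x2 <= -3
b_ub = [1, -3]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.status, res.message)   # expected status: 2 (infeasible)
```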

The basic difference between infeasibility and unboundedness for a maximization type of problem is:
• Infeasibility: not even a single feasible solution exists
• Unboundedness: infinitely many feasible solutions exist, but none of them can be termed
the optimal
Check your progress 3
1. Transfer the values of optimal solution in one of the constraints and the result is LHS = RHS,
that means it is_________________ constraint.

2. Following is not the special case of LPP graphical method.


A) Multiple optimal solutions
B) Infeasibility
C) Unboundedness
D) Divisibility

3. State true or false


When a constraint, when plotted, does not form part of the boundary marking the feasible region of the problem, it is known as a redundant constraint.

4. Infinitely many feasible solutions exist but none of them can be termed an optimal solution: this is known as the ______________ special case of LPP.

5. If two or more optimal solutions give the same maximum profit or minimum cost, the situation is termed as ________________________

1.6 Application Areas of Linear Programming in Business

Marketing Research / Consumer Research: minimizing the cost of research subject to the given constraints

Media Selection: the objective can be to maximize audience exposure or to minimize advertising cost

Production Mix: deciding the number of units of one or more different products to produce so as to maximize profit or minimize cost

Labor Scheduling: allocating the number of working hours to each labor activity

Production Scheduling: setting a low-cost production schedule over a period of weeks or months, considering factors such as labor capacity, inventory and storage costs, space limitations, product demand and labor relations

Shipping and Transportation: minimizing transportation cost

Ingredient Mix: deciding the mixing proportion of ingredients for making one or more products

Financial Portfolio Selection: maximizing return on investment subject to a set of risk factors
1.7 Let Us Sum Up

In this unit, we started with a general introduction to the linear programming problem, followed by identification of the decision variables, which are associated with economic or physical quantities whose values are of major interest to the management. The problem must have a well-defined objective function expressed in terms of the decision variables. The objective function must be maximized when it expresses profit or contribution. In case the objective function indicates a cost, it must be minimized. When a problem of management is expressed as a mathematical function by using decision variables with an appropriate objective function and constraints, the problem has been formulated. A linear programming problem with only two decision variables can be solved graphically. Any non-negative solution which satisfies all the constraints is known as a feasible solution of the problem. The common region which satisfies all the constraints is known as the feasible region. The values of the decision variables which maximize or minimize the objective function are located at an extreme point of the convex set (feasible region) formed by the feasible solutions. From all the feasible solutions, there can be one or more optimal solutions. Sometimes the problem may be infeasible, indicating that no feasible solution of the problem exists. Sometimes there is no boundary to the feasible region, so the value of the objective function can be improved indefinitely; such a problem has an unbounded solution with no finite optimum. The different applications of linear programming were also discussed in this unit.

1.8 Answers for Check your Progress

Answers to check your progress 1

1. Minimize Cost Z = 2100x1 + 1600x2

Subject to,
5x1 + 2x2 ≥ 14 (Good Iron Constraint)
3x1 + 2x2 ≥ 9 (Mediocre Iron Constraint)
4x1 + 10x2 ≥ 22 (Bad Iron Constraint)
x1, x2 ≥ 0 (Non-negativity constraint)

2. False

3. A

Answers to check your progress 2

1. A

2. D

3. Two

4. Feasible
Answers to check your progress 3

1. Binding

2. D

3. True

4. Unboundedness

5. Multiple Optimal Solutions

1.9 Glossary

Decision Variables: are economic or physical quantities whose numerical values indicate the
solution of the linear programming problem.

The Objective Function: of a linear programming problem is a linear function of the decision
variables expressing the objective of the decision maker.

Constraints: of a linear programming problem are linear equations or inequalities arising out of
practical limitations.

A Feasible Solution: of a linear programming problem is a solution which satisfies all the
constraints including the non-negativity constraints.

The Feasible Region: is the collection of all feasible solutions.

A Redundant Constraint: is a constraint which does not affect the feasible region.

A Convex Set: is a collection of points such that, for any two points in the set, the line segment joining the points belongs entirely to the set.

Non-Convex Set: a set in which the line segment joining some pair of points of the set does not lie entirely within the set.

Multiple Solutions: of a linear programming problem are two or more solutions, each of which gives the same optimal (maximum or minimum) value of the objective function.

Unbounded Solution: of a linear programming problem is a solution whose objective function value can be made infinitely large (or small) without violating any constraint.

Infeasible Solution: the situation in which a linear programming problem has no solution that satisfies all the constraints.


1.10 Assignment

1. A retired person wants to invest up to an amount of Rs. 30,000 in fixed income securities. His broker recommends investing in two bonds: Bond A yielding 7% and Bond B yielding 10%. After some consideration, he decides to invest at most Rs. 12,000 in Bond B and at least Rs. 6,000 in Bond A. He also wants the amount invested in Bond A to be at least equal to the amount invested in Bond B. What should the broker recommend if the investor wants to maximize his return on investment? Solve graphically.

2. A firm manufactures two products, a TV and a DVD player, which must be processed through two processes, Assembly and Finishing. Assembly has 90 hours available and Finishing has 82 hours available. One TV set requires 5 hours in assembly and 3 hours in finishing, while one DVD player requires 6 hours in assembly and 4 hours in finishing. If the profit is Rs. 900 per TV and Rs. 600 per DVD player, find the best combination of TVs and DVD players to realize the maximum profit.

3. A rubber company is engaged in producing three different types of tyres A, B, and C. The
company has two production plants to produce these. In a normal eight hour working day, plant I
produces 100, 200 and 200 tyres of types A, B and C respectively. Plant II produces 120, 120,
and 400 tyres of type A,B, and C respectively. The monthly demand of A, B, and C is 5000,
6000 and 14000 units respectively. The daily cost of operation of plants I and II are Rs. 5000 and
Rs. 7000 respectively. Find the minimum number of days of operation per month at the two
plants that minimizes the total cost while meeting the demand, using the graphical method.

4. Find the graphical solution of the following problem.


Find x and y so as to

Minimize Z = X + Y subject to the following constraints;


5X + 10Y ≤ 50 ,
X+Y≥1,
Y≤4,
X,Y≥0.

Observe the solution and comment on it.

5. A firm uses lathes, milling and grinding machines to produce two parts. Following table
represents the machining times required for each part, available machine time on different
machines and the profit values:

Machine Type           Required machine time (min)      Maximum time available
                       Part I          Part II          per week (min)
Lathes                 12              6                3000
Milling Machines       4               10               2000
Grinding Machines      2               3                900
Profit per unit (Rs)   40              100

Formulate the problem as an LPP and solve it graphically to find the product mix that maximizes the profit.
1.11 Activity

Visit a manufacturing company, collect the data regarding any two types of products they
produce which use common any number of resources, cost or profit per unit of product,
minimum or maximum availability of resources, number of hours or kgs etc. require to produce
one unit of product. Then prepare a table of the information, formulate as LPP and solve
graphically to identify optimal cost or profit

1.12 Case Study


Suppose Mr. Deshmukh is a production manager in a manufacturing company. He has the problem of deciding the optimal product mix for the next month. The company manufactures two products, Resistors and Capacitors, which yield unit contributions of Rs. 100 and Rs. 40 respectively. The company has three facilities (resources): 1000 kg of raw material and 900 hours of machine time are available for the next month, and 5 workers can work 5 hours a day for 20 days in the coming month. It is known that there is sufficient demand for the products, so all the units produced will be sold. Mr. Deshmukh collected the relevant data carefully and wants to solve the problem as a linear programming model. The relevant data are shown in the following table:

1) Solve the problem using the graphical method to determine the optimum product mix of capacitors and resistors for the next month. Also determine the corresponding optimum achievable profit from the sale of resistors and capacitors. Which facilities are fully utilized and which resources are left unused at the optimal stage?

2) Are there alternate (multiple) optimal solutions available to Mr. Deshmukh? If so, suggest another solution.

Resources               Product                        Resource Availability
                        Resistors      Capacitors
Raw Material            5              2               1000 Kg
Machine Capacity        1              2               900 Hours
Workers Availability    1              2               500 Hours
Profit (Rs.)            100            40

1.13 Further Reading

1. Quantitative Techniques in Management, by N.D. Vora, McGraw Hill


2. Operations Research Theory and Applications, by J.K. Sharma, Macmillan
3. Operations Research, by Hamdy A. Taha, Pearson Education
4. Quantitative Analysis for Management, by Barry Render, Ralph M. Stair, Michael E.
Hanna and T. N. Badri, Pearson Publication
5. Use of software like QM for Windows, Excel Solver
__________________________________________________________________________
Unit No. 2 Linear Programming Problem –
Simplex Method
_________________________________
Unit Structure
2.0 Learning Objectives

2.1 Introduction

2.2 Simplex Method


2.2.1 Algorithm of simplex method
2.2.2 Principles of simplex method
2.2.3 Computational part of simplex method
Check your progress

2.3 Let Us Sum Up

2.4 Answers for Check your Progress

2.5 Glossary

2.6 Assignment

2.7 Activities

2.8 Case Study

2.9 Further Reading


2.0 Learning Objectives

• Discuss the principles of simplex method

• Learn the algorithm of simplex method

• Understand computational part of simplex method

2.1 Introduction

The graphical method of solving a linear programming problem that you learned in the previous unit is a vital help in understanding the basic structure of a problem, but the method has limited application in industrial problems, where the number of variables involved is usually substantially large. A more useful method, known as the Simplex Method, is suitable for solving linear programming problems with a larger number of variables (two or more). Through an iterative process the method progressively approaches, and finally reaches, the maximum or minimum value of the objective function. The method also helps the decision maker to identify redundant constraints, an unbounded solution, multiple solutions and an infeasible problem.

In industrial or business applications of linear programming, the coefficients of the objective function and the right-hand sides of the constraints are generally assumed to be known with complete certainty. However, in a large number of problems the uncertainty is high enough that the effect of inaccurate coefficients can be significant. The effect of changes in the coefficients on the maximum or minimum value of the objective function can be studied through a technique known as Sensitivity Analysis.

Every linear programming problem has a dual problem associated with it. The solution of the dual problem is readily obtained from the solution of the original problem if the simplex method is used for this purpose. The variables of the dual problem are known as dual variables or shadow prices of the various resources. The solution of the dual problem can be used by the decision maker for augmenting the resources.

2.2 Simplex Method

The simplex method was developed by G. Dantzig in 1947. The simplex method provides an algorithm which is based on the fundamental theorem of linear programming. The simplex algorithm is an iterative procedure for solving LP problems in a finite number of steps. It consists of the following:
• Having a trial basic feasible solution to constraint-equations
• Testing whether it is an optimal solution
• Improving the first trial solution by a set of rules and repeating the process till an optimal
solution is obtained
2.2.1 Algorithm of simplex method

To solve a linear programming problem in standard form, use the following steps.

1. Convert each inequality in the set of constraints to an equation by adding slack variables.

2. Create the initial simplex tableau, calculate Z and Δj, and test the basic feasible solution for optimality.

3. This step improves the basic feasible solution: the vector entering the basis matrix and the vector to be removed from the basis matrix are determined. Locate the most negative entry in the bottom row. The column containing this entry is called the entering column. (If ties occur, any of the tied entries can be used to determine the entering column.) Next, find the minimum ratio using the column of the incoming variable. The row giving the minimum ratio identifies the outgoing variable (negative ratios are never considered). The intersection of the incoming variable's column and the outgoing variable's row gives the key element.

4. Mark the key element at the intersection of the incoming and outgoing variables. Divide all the elements of that row by the key element. Then subtract appropriate multiples of this new row from the remaining rows, so as to obtain zeroes in the remaining positions of the respective column.

5. Repeat step 3 to 4 until an optimal solution is obtained.

6. If all entries in the bottom row are zero or positive, this is the final tableau.
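As a rough illustration of steps 1–6, the following Python sketch implements the tableau iterations for a maximization problem in which all constraints are of the ≤ type with non-negative right-hand sides. The function name simplex_max and the NumPy implementation are illustrative only, not part of the method's standard description; the sketch is applied to Example I, which is worked by hand in section 2.2.3 below (optimum x1 = 3, x2 = 1, Z = 11).

```python
# A minimal tableau-simplex sketch for: maximize c'x subject to Ax <= b, x >= 0 (b >= 0).
import numpy as np

def simplex_max(c, A, b):
    m, n = len(A), len(c)
    # Tableau [A | I | b] with the objective row [-c | 0 | 0] at the bottom (steps 1 and 2).
    T = np.zeros((m + 1, n + m + 1))
    T[:m, :n] = A
    T[:m, n:n + m] = np.eye(m)
    T[:m, -1] = b
    T[-1, :n] = -np.array(c, dtype=float)
    basis = list(range(n, n + m))                 # slack variables start in the basis
    while True:
        col = int(np.argmin(T[-1, :-1]))          # entering column: most negative entry (step 3)
        if T[-1, col] >= 0:
            break                                 # step 6: no negative entries left -> optimal
        ratios = [T[i, -1] / T[i, col] if T[i, col] > 0 else np.inf for i in range(m)]
        row = int(np.argmin(ratios))              # outgoing row: minimum ratio (step 3)
        if ratios[row] == np.inf:
            raise ValueError("problem is unbounded")
        T[row] /= T[row, col]                     # step 4: normalise the key row
        for i in range(m + 1):
            if i != row:
                T[i] -= T[i, col] * T[row]        # step 4: clear the rest of the column
        basis[row] = col
    x = np.zeros(n + m)
    x[basis] = T[:m, -1]
    return x[:n], T[-1, -1]                       # decision variables and maximum Z

# Example I (section 2.2.3): max 3x1 + 2x2 s.t. x1 + x2 <= 4, x1 - x2 <= 2.
print(simplex_max([3, 2], [[1, 1], [1, -1]], [4, 2]))   # expected: x1 = 3, x2 = 1, Z = 11
```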

2.2.2 Principles of simplex method

Consider following example of linear programming problem to understand simplex method basic
principles. In simplex method the objective function is to be maximized always, not
minimized.

Maximize Z = 4x1 + 6x2

Subject to,
-x1 + x2 ≤ 11
x1 + x2 ≤ 27
2x1 + 5x2≤ 90
x1, x2 ≥ 0

Since the left-hand side of each inequality is less than or equal to the right-hand side, there must exist nonnegative numbers s1, s2 and s3 that can be added to the left-hand side of each inequality to produce the following system of linear equations. The numbers s1, s2 and s3 are called slack variables because they take up the "slack" in each inequality. Remember that slack variables are added only to the constraints; they appear in the objective function with zero coefficients.

Maximize Z = 4x1 + 6x2 + 0s1 + 0s2 + 0s3

Subject to,

-x1 + x2 + s1 = 11
x1 + x2 + s2 = 27
2x1 + 5x2 + s3 = 90

A basic solution of a linear programming problem in standard form is a solution of the constraint equations in which at most m variables are nonzero. The variables that are nonzero are called basic variables. A basic solution in which all variables are nonnegative is called a basic feasible solution.

Procedure to test the basic feasible solution for optimality by the rules given:
Rule 1: If all Δj ≥ 0, the solution under test is optimal. An alternate optimal solution exists if any non-basic Δj is also zero.
Rule 2: If at least one Δj is negative, the solution is not optimal; proceed to improve the solution in the next step.

2.2.3 Computational part of simplex method

Example I

Maximize Z = 3x1 + 2x2

Subject to
x1 + x2 ≤ 4
x1 – x2 ≤ 2
and x1 ≥ 0, x2 ≥ 0

Solution of Example I

1. Convert each inequality in the set of constraints to an equation by adding slack variables.

Maximize Z = 3x1 + 2x2 + 0s1 + 0s2

Subject to
x1 + x2+ s1= 4
x1 – x2 + s2= 2
x1 ≥ 0, x2 ≥ 0, s1 ≥ 0, s2 ≥ 0

2. Create the initial simplex tableau, calculate Z and Δj, and test the basic feasible solution for optimality.

The simplex method is carried out by performing elementary row operations on a matrix that we call the simplex tableau. This tableau consists of the matrix of the constraint coefficients together with the coefficients of the objective function, written in the specific form shown below. In the initial simplex tableau the objective-function coefficients are entered in the bottom row with negative signs.

Basic            CB    x1    x2    s1    s2    XB = RHS of    Minimum Ratio
Variables                                      constraint     = XB/Xk
s1               0     1     1     1     0     4
s2               0     1     -1    0     1     2
Z = (CB*XB) = 0        -3    -2    0     0

Calculation of Z and Δj, and test of the basic feasible solution for optimality by the rules given:
Z = CB·XB = (0*4 + 0*2) = 0

In the calculations below, Cj denotes the objective-function coefficient of x1, x2, s1 and s2 respectively.

Δ(x1) = CB·X1 – Cj = (0*1 + 0*1) – 3 = –3
Δ(x2) = CB·X2 – Cj = (0*1 + 0*(-1)) – 2 = –2
Δ(s1) = CB·X3 – Cj = (0*1 + 0*0) – 0 = 0
Δ(s2) = CB·X4 – Cj = (0*0 + 0*1) – 0 = 0

In this problem it is observed that there are negative values -3 and -2. Hence proceed to improve
this solution.

3. This step improves the basic feasible solution: the vector entering the basis matrix and the vector to be removed from the basis matrix are determined. Locate the most negative entry in the bottom row; the column containing this entry is the entering column. (If ties occur, any of the tied entries can be used to determine the entering column.) Next, find the minimum ratio using the column of the incoming variable; the row with the minimum ratio identifies the outgoing variable (negative ratios are never considered). The intersection of the incoming variable's column and the outgoing variable's row gives the key element.

Basic            CB    x1 = Xk       x2    s1    s2    XB = RHS of    Minimum Ratio
Variables              (incoming)                      constraint     = XB/Xk
s1               0     1             1     1     0     4              4/1 = 4
s2               0     1*            -1    0     1     2              2/1 = 2  <- outgoing variable
Z = (CB*XB) = 0        -3            -2    0     0
                       (incoming)

* Key element (intersection of the incoming column and the outgoing row)

4. Mark the key element at the intersection of the incoming and outgoing variables. Divide all the elements of that row by the key element. Then subtract appropriate multiples of this new row from the remaining rows, so as to obtain zeroes in the remaining positions of the column Xk.

Here the key element is 1, so divide the second row by 1. The related calculation is shown below.

Use (R1 = R1 – R2) for the first-row calculation, that is 1–1 = 0, 1–(–1) = 2, 1–0 = 1, 0–1 = –1, 4–2 = 2 respectively.

Basic               CB    x1    x2            s1    s2    XB = RHS of    Minimum Ratio
Variables                                                 constraint     = XB/Xk
s1                  0     0     2*            1     -1    2              2/2 = 1  <- outgoing variable
x1                  3     1     -1            0     1     2              2/(-1) = -2 (neglect: negative)
Z = 0*2 + 3*2 = 6         0     -5            0     3
                                (incoming)

* Key element

The values 6, 0, –5, 0 and 3 are calculated as explained in step 2. One value, –5, is still negative, so this is not yet an optimal solution.

5. Repeat step 3 to 4 until an optimal solution is obtained.

Basic            CB    x1    x2    s1     s2     XB = RHS of
Variables                                        constraint
x2               2     0     1     1/2    -1/2   1
x1               3     1     0     1/2    1/2    3
Z = 11                 0     0     5/2    1/2
6. If all entries in the bottom row are zero or positive, this is the final tableau. The variables in the basis (with nonzero XB values) are the basic variables; the remaining variables, whose value is zero, are the non-basic variables.
As all the values in the bottom row are now non-negative, this is the optimal solution. The XB values give the solution, so the answer is x1 = 3 and x2 = 1, and the maximum profit is Z = 3(3) + 2(1) = 11.

Example II

Maximize Z = 80x1 + 55x2


Subject to
4x1 + 2x2 ≤ 40
2x1 + 4x2 ≤ 32
and x1 ≥ 0, x2 ≥ 0

Solution of Example II
Maximize Z = 80x1 + 55x2 + 0s1 + 0s2
Subject to
4x1 + 2x2+ s1= 40
2x1 + 4x2 + s2= 32
x1 ≥ 0, x2 ≥ 0, s1 ≥ 0, s2 ≥ 0

                              Cj →    80     55     0      0
Basic        CB    XB                 x1     x2     s1     s2     Min ratio XB/Xk
Variables
s1           0     40                 4      2      1      0      40/4 = 10  <- outgoing
s2           0     32                 2      4      0      1      32/2 = 16
Z = CB*XB = 0                         -80    -55    0      0
                                      ^ incoming

x1           80    10                 1      1/2    1/4    0      10/(1/2) = 20
s2           0     12                 0      3      -1/2   1      12/3 = 4   <- outgoing
Z = 800                               0      -15    20     0
                                             ^ incoming

x1           80    8                  1      0      1/3    -1/6
x2           55    4                  0      1      -1/6   1/3
Z = 860                               0      0      35/2   5

The answer is x1 = 8 and x2 = 4, so Z = 860.
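Example II can be cross-checked in the same way with SciPy's linprog (an assumption of this sketch; the text itself uses the hand tableau only), negating the objective so that the maximization becomes a minimization.

```python
# Cross-check of Example II with SciPy's LP solver.
from scipy.optimize import linprog

res = linprog([-80, -55], A_ub=[[4, 2], [2, 4]], b_ub=[40, 32],
              bounds=[(0, None), (0, None)])
print(res.x, -res.fun)   # expected: [8. 4.] and 860.0
```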
Check your progress 1
1. In the simplex method, a tableau is optimal only if all the Z values at the end of the solution:
(a) zero or negative.
(b) zero.
(c)negative and nonzero.
(d) positive and zero.

2. Linear programming problem involving more than two variables can be solved by:

(a) Simplex method


(b) Graphical method
(c) Matrix minima method
(d) None of these

3. Maximize Z = 3x1 + 2x2 + 5x3

Subject to,

x1 + 2x2 + 2x3 ≤ 8
3x1 + 2x2 + 6x3 ≤ 12
2x1 + 3x2 + 4x3 ≤ 12
x1, x2, x3 ≥ 0

2.3 Let Us Sum Up

The simplex method is the appropriate method for solving a linear programming problem with
more than two decision variables. For less than or equal to type constraints slack variables are
introduced to make inequalities equations. A type of solution known as a basic feasible solution
is important for simplex computation. A basic feasible solution of a system with m equations and
n variables has m non negative variables known as basic variables and n-m variables with value
zero known as non-basic variables. An initial basic feasible solution can always be found with the help of the slack variables. The objective function is maximized at one of the basic feasible solutions.
Starting with the initial basic feasible solution obtained from the slack variables the simplex
method improves the value of the objective function step by step by bringing in a new basic
variable and making one of the present basic variables non basic. The selection of the new basic
variable and the omission of a current basic variable are performed following certain rules so that
the revised basic feasible solution improves the value of the objective function. The iterative
procedure stops when it is no longer possible to obtain a better value of the objective function
than the present one. The existing basic feasible solution is the optimum solution of the problem
which maximizes objective function.
2.4 Answers for Check your Progress

Answers to check your progress

1. d

2. a

3. Z = 12 at x1 = 4, x2 = 0, x3 = 0 (alternate optima exist, e.g. x1 = 2.4, x2 = 2.4, x3 = 0)

2.5 Glossary

A Slack Variable: corresponding to a less-than-or-equal-to type constraint is a non-negative


variable introduced to convert the constraint into an equation.
Basic Feasible Solution: of a system of m equations and n variables is a solution where m variables are non-negative and n-m variables are zero.
A Basic Variable: of a basic feasible solution has a non-negative value.
A Non-Basic Variable: of a basic feasible solution has a value equal to zero.
The Optimum Solution: of a linear programming problem is the solution where the objective function is maximized or minimized.

2.6 Assignment

1. Solve the following LP problem using simplex method.

Maximize z = 3x1 + 2x2 + 5x3

Subject to
x1 + 2x2 + x3 ≤ 430
3x1 + 2x3 ≤ 460
x1 + 4x2 ≤ 420
x1, x2, x3 ≥ 0

2. A manufacturer of bags makes three types of bags P, Q and R which are processed on three
machines M1, M2 and M3. Bag P requires 2 hours on machine M1 and 3 hours on machine M2
and 2 hours on machine M3. bag Q requires 3 hours on machine M1, 2 hours on machine M2
and 2 hours on machine M3 and Bag R requires 5 hours on machine M2 and 4 hours on machine
M3. There are 8 hours of time per day available on machine M1, 10 hours of time per day
available on machine M2 and 15 hours of time per day available on machine M3. The profit
gained from bag P is Rs 3.00 per unit, from bag Q is Rs 5.00 per unit and from bag R is Rs 4.00
per unit. what should be the daily production of each type of bag so that the products yield the
maximum profit?
3. Use the simplex method solve the following LPP problem:

Max Z = 30x + 40y +20z


subject to
10x + 12y + 7z ≤ 10,000
7x +10y + 8z ≤ 8,000
x + y + z ≤ 1,000
x, y, z ≥ 0

4. Comment on the solution obtained by simplex method of the following LP problem:


Max Z = 3x1 + 2x2 + 3x3

Subject to
2x1 + x2 + x3 ≤ 2
3x1 + 4x2 + 2x3 ≤ 8
x1, x2, x3 ≥ 0

5. Solve the following LPP by Simplex Method:


Maximize z = x1 + x2
Subject to x1 + 2x2 ≤ 2000
x1 + x2 ≤ 1500
x2 ≤ 600
x1, x2 ≥ 0

2.7 Activities

1. Solve by simplex method: Max z = 3x1 + 5x2 + 4x3

Subject to
2x1 + 3x2 ≤ 8
2x2 + 5x3 ≤ 10
3x1 + 2x2 +4x3 ≤15
x1, x2, x3 ≥ 0

2. The products A, B and C are produced in three machine centres X, Y and Z. Each product involves an operation on each of the machine centres. The time required for each operation per unit of each product is given below. 100, 77 and 80 hours are available at machine centres X, Y and Z respectively. The profit per unit of A, B and C is Rs. 12, Rs. 3 and Rs. 1 respectively. Find a suitable product mix so as to maximize the profit.

Maximize Z = 12x1 + 3x2 + x3


Subject to,
10x1 + 2x2 + x3 ≤ 100
7x1 + 3x2 + 2x3 ≤ 77
2x1 + 4x2 + x3 ≤ 80
x1, x2, x3 ≥ 0
2.8 Case Study

A manufacturer of three products tries to follow a policy of producing those which contribute most to fixed cost and profit. However, there is also a policy of recognizing certain minimum sales requirements. Currently, these are for products x1, x2 and x3. There are three producing departments. The production times in hours per unit in each department and the total times available each week in each department are given in the table. The contribution per unit of products x1, x2 and x3 is Rs. 10.50, Rs. 9.00 and Rs. 8.00 respectively. Solve by the simplex method.

Departments Time required for production Total hours available


x1 x2 x3
1 0.25 0.20 0.15 420
2 0.30 0.40 0.50 1048
3 0.25 0.30 0.25 529

2.9 Further Reading

1. Quantitative Techniques in Management, by N.D. Vora, McGraw Hill


2. Operations Research Theory and Applications, by J.K. Sharma, Macmillan
3. Operations Research, by Hamdy A. Taha, Pearson Education
4. Quantitative Analysis for Management, by Barry Render, Ralph M. Stair, Michael E.
Hanna and T. N. Badri, Pearson Publication
5. Use of software like QM for Windows, Excel Solver
Unit No. 3 Transportation
_________________________________
Unit Structure
3.0Learning Objectives

3.1Introduction
3.1.1Basic Structure of Transportation

3.2 Initial Basic Feasible Solution of a transportation problem


3.2.1 North-West Corner Method (NWM)
3.2.2 Least Cost Method (LCM)
3.2.3 Vogel’s Approximation Method (VAM)
Check your progress 1

3.3 Optimal Solution Method – Modified Distribution Method (MODI Method)


Check your progress 2

3.4 Special Cases of Transportation


3.4.1Unbalanced Transportation Problem
3.4.2Multiple Optimal Solutions
3.4.3Degeneracy
Check your progress 3

3.5Let Us Sum Up

3.6Answers for Check your Progress

3.7Glossary

3.8Assignment

3.9Activities

3.10Case Study

3.11 Further Reading


3.0 Learning Objectives

• Understand the practicality of the concept with stated assumptions


• Understand the basic feasible solution of a transportation problem by various methods
• Obtain the minimum transportation cost schedule by using Modified DistributionMethod
(MODI Method)
• Discuss the special cases of transportation
• Discuss the steps of learned method when problem is related to minimization

3.1 Introduction
The transportation problem deals with the distribution of goods from several points of supply (sources/origins) to a number of points of demand (destinations). Usually we are given the capacity of goods at each source and the requirements at each destination. Basically, the objective is to minimize the total transportation and production cost; sometimes we deal with maximization of profit instead. The solution procedure is iterative: a solution to the transportation problem is found and then evaluated using a special procedure to determine whether it is optimal. When the solution is optimal, the process stops; if not, a new solution is generated. The basic structure of a transportation problem is discussed with the help of the following example.

3.1.1 Basic Structure of Transportation

Source P Q R S Supply
A 40 45 35 36 300
B 48 50 52 46 200
C 43 44 55 50 400
D 44 50 40 30 400
Demand 250 300 350 400 1300

Consider a manufacturer who operates four factories (Sources) and dispatches his products to
four different retail shops (Destinations). The Table above indicates the capacities (Supply) of
the four factories, the quantity of products required (Demand) at the various retail shops and the
cost of shipping one unit of the product from each of four factories to each of the four retail
shops.

The Table usually referred to as Transportation Table provides the basic data regarding the
transportation problem. The capacity of factories A, B, C and D is 300, 200, 400 and 400 respectively. The requirement at retail shops P, Q, R and S is 250, 300, 350 and 400 respectively. The figures inside the intersecting cells (e.g. cell AP, the per-unit transportation cost from source A to destination P) are known as unit transportation costs. So, the cost of transporting one unit from source A to retail shop P is Rs. 40, from factory A to retail shop Q is Rs. 45, and so on.

3.2 Initial Basic Feasible Solution of a Transportation Problem


In general, any basic feasible solution of a transportation problem with m origins (such as
factories) and n destinations (such as retail shops) starts with the vital condition check of
SUPPLY=DEMAND which is also known as rim requirement of transportation problem
(Balanced Transportation Problem). The following methods are available for the calculation of
an initial basic feasible solution. All the three methods have been explained using Example I.

Example I

Source Destination Supply


P Q R S
A 15 18 22 16 30
B 15 19 20 14 40
C 13 16 23 17 30
Demand 20 20 25 35 100

Solutions of Example I by following three methods

3.2.1 North-West Corner Method (NWM)


1 First check supply and demand if it is equal, go to step 2 or add dummy row with zero cost in
each cell if supply is less and add dummy column with zero cost in each cell if demand is
less.
2 Start in the upper left-hand cell and allocate units to shipping routes as follows:
3 Exhaust the supply (factory capacity) of each row before moving down to the next row.
4 Exhaust the demand (warehouse) requirements of each column before moving to the next
column to the right.
5 Check that all supply and demand requirements are met.

Solution

Here supply = demand = 100, so go ahead with step 2. First, start with the cell at the intersection of A and P. The supply at source A is 30 and the demand at destination P is 20, so allocate the minimum of the two, 20 units, at AP; 10 units of supply remain at source A. The requirement at destination P is now satisfied, so eliminate column P and move horizontally to cell AQ. With 10 units of supply available at source A and a demand of 20 at Q, allocate the minimum of the two, which is 10, at AQ. Since no supply is now left at source A, move down to cell BQ, where 10 units of demand remain to be satisfied. Allocate 10 units at BQ and move horizontally to BR; with the remaining supply at B being 30 and the demand at R being 25, allocate 25 at BR. Move horizontally again to BS: with the remaining 5 units at source B and a demand of 35 at S, allocate 5 units at BS. Cells CP, CQ and CR have no remaining demand to serve, so by default the last 30 units of source C are allocated at cell CS.

This is the simplest method to use, but because it starts from the north-west corner without looking at the transportation costs, it may allocate units to high-cost cells.
Initial Feasible Solution: NWC Method

Source Destination Supply


P Q R S
A 15[20] 18[10] 22 16 30 10
B 15 19[10] 20[25] 14[5] 40 30 5
C 13 16 23 17[30] 30
Demand 20 20 10 25 35 30 100

Calculate Total Cost = (15*20) + (18*10) + (19*10) + (20*25) + (14*5) + (17*30) = 1750 Rs.
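The North-West Corner rule is easy to automate. The Python sketch below (the function name north_west_corner is illustrative; it assumes a balanced problem in which total supply equals total demand) reproduces the allocation and total cost of Rs. 1750 obtained above.

```python
# A minimal sketch of the North-West Corner rule for a balanced problem.
import numpy as np

def north_west_corner(supply, demand):
    supply, demand = list(supply), list(demand)
    alloc = np.zeros((len(supply), len(demand)))
    i = j = 0
    while i < len(supply) and j < len(demand):
        qty = min(supply[i], demand[j])   # allocate as much as possible in the current cell
        alloc[i, j] = qty
        supply[i] -= qty
        demand[j] -= qty
        if supply[i] == 0:
            i += 1                        # row exhausted: move down
        else:
            j += 1                        # column satisfied: move right
    return alloc

cost = np.array([[15, 18, 22, 16],
                 [15, 19, 20, 14],
                 [13, 16, 23, 17]])
plan = north_west_corner([30, 40, 30], [20, 20, 25, 35])
print(plan)
print((plan * cost).sum())               # expected total cost: 1750
```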

3.2.2 Least Cost Method (LCM)

1 First check supply and demand if it is equal, go to step 2 or add dummy row if supply is less
and add dummy column if demand is less.
2 Choose the cell with minimum cost.
3 Consider the supply at source and demand at destination corresponding to that cell and
allocate lower of the two to that cell.
4 Delete the row or column whichever is satisfied by this allocation.
5 If row is deleted, then the column value is revised by subtracting the quantity and column is
deleted then row value is revised.
6 Again, choose the one with least cost from remaining cells, make assignments and adjust row
and column total.
7 Continue until all the units are assigned

Solution

Here supply = demand = 100, so go ahead with step 2. First, select the least cost in the whole matrix, which is 13 at cell CP. At CP, supply is 30 and demand is 20, so allocate 20 units at CP and cross out column P, as its demand has been satisfied; 10 units of supply remain at C. The minimum of the remaining costs is 14 at cell BS; supply at B is 40 and demand at S is 35, so allocate 35 units at BS and cross out column S, leaving 5 units of supply at B. The next minimum is 16 at CQ; allocate the remaining 10 units of source C there and cross out row C, leaving a demand of 10 at Q. The next minimum is 18 at AQ; allocate 10 units (the remaining demand at Q) and cross out column Q, leaving 20 units of supply at A. Of the two remaining cells, the minimum is 20 at BR, so allocate the remaining 5 units of source B at BR, and finally allocate the last 20 units of source A at AR.
Initial Feasible Solution: LCM Method

Source   Destination                                        Supply
         P          Q          R          S
A        15         18 [10]    22 [20]    16                30  20
B        15         19         20 [5]     14 [35]           40  5
C        13 [20]    16 [10]    23         17                30  10
Demand   20         20  10     25  20     35                100

Calculate Total Cost = (18*10) + (22*20) + (20*5) + (14*35) + (13*20) + (16*10) = 1630 Rs.
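The Least Cost Method can be sketched in the same style. The code below is a rough illustration under the assumption of a balanced problem; it picks the cheapest usable cell at every step and reproduces the total cost of Rs. 1630 for Example I.

```python
# A minimal sketch of the Least Cost Method for a balanced problem.
import numpy as np

def least_cost_method(cost, supply, demand):
    cost = np.array(cost, dtype=float)
    supply, demand = list(supply), list(demand)
    alloc = np.zeros_like(cost)
    blocked = np.zeros_like(cost, dtype=bool)     # True = cell can no longer be used
    while sum(supply) > 0 and sum(demand) > 0:
        usable = np.where(blocked, np.inf, cost)
        i, j = np.unravel_index(np.argmin(usable), cost.shape)
        qty = min(supply[i], demand[j])
        alloc[i, j] = qty
        supply[i] -= qty
        demand[j] -= qty
        if supply[i] == 0:
            blocked[i, :] = True                  # source exhausted: block its row
        if demand[j] == 0:
            blocked[:, j] = True                  # destination satisfied: block its column
    return alloc

cost = [[15, 18, 22, 16], [15, 19, 20, 14], [13, 16, 23, 17]]
plan = least_cost_method(cost, [30, 40, 30], [20, 20, 25, 35])
print((plan * np.array(cost)).sum())              # expected total cost: 1630
```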

3.2.3 Vogel’s Approximation Method (VAM)

1 For each row/column of table, find difference between two lowest costs. (Opportunity
cost/Penalty)
2. Find greatest opportunity cost/Penalty.
3. Assign as many units as possible to lowest cost square in row/column with greatest
opportunity cost.
4. Eliminate row or column which has been completely satisfied.
5. Begin again, omitting the eliminated rows/columns. The process is repeated a number of times, so it is known as an iterative process.

Solution

The highest penalty of 3 occurs at row C; the minimum cost in row C is 13, so at the corresponding cell CP allocate 20 units and eliminate column P, as its demand has been satisfied; 10 units remain at source C. Repeat steps 1 and 2 in the second iteration (II) with only the remaining values of columns Q, R and S. The highest penalty is now at row B, in which the minimum cost is 14, so allocate 35 units at cell BS and eliminate column S, as its demand has been satisfied. Repeat steps 1 and 2 in the third iteration (III) with only the remaining values of columns Q and R. Now the highest penalty is at row C with a minimum cost of 16, so allocate 10 units at cell CQ and eliminate row C, as its supply has been delivered fully. Differences can still be calculated between the remaining values, so repeat steps 1 and 2 in the fourth iteration. The highest penalty is at row A with a minimum cost of 18, so allocate the remaining 10 units of demand at cell AQ and eliminate column Q. Now only one column is left, so no difference can be calculated and no further iteration is needed; allocate the remaining supply and demand accordingly.

Source     Destination                                      Supply       Row Penalties
           P          Q          R          S                            I    II   III  IV
A          15         18 [10]    22 [20]    16              30  20       1    2    4    4
B          15         19         20 [5]     14 [35]         40  5        1    5    1    1
C          13 [20]    16 [10]    23         17              30  10       3    1    7    -
Demand     20         20  10     25         35              100
Column     I     2    2          2          2
Penalties  II    -    2          2          2
           III   -    2          2          -
           IV    -    1          2          -

Calculate Total Cost = (18*10) + (22*20) + (20*5) + (14*35) + (13*20) + (16*10) = 1630 Rs.

Note: If there is a tie between two minimum costs, select the one where maximum allocation can
be done. If there is a tie between two least cost as well as maximum allocation, select either of
the two.
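Vogel's Approximation Method can also be sketched in code. The function below is only an illustration under the assumption of a balanced problem (the single remaining value in a row or column is treated as its own penalty, and ties are broken by whichever row or column the max function happens to return first); applied to Example I it reproduces the total cost of Rs. 1630.

```python
# A minimal sketch of Vogel's Approximation Method for a balanced problem.
import numpy as np

def vam(cost, supply, demand):
    cost = np.array(cost, dtype=float)
    supply, demand = list(map(float, supply)), list(map(float, demand))
    alloc = np.zeros_like(cost)
    rows, cols = set(range(cost.shape[0])), set(range(cost.shape[1]))

    def penalty(values):                 # difference between the two lowest costs
        v = sorted(values)
        return v[1] - v[0] if len(v) > 1 else v[0]

    while rows and cols:
        row_pen = {i: penalty([cost[i, j] for j in cols]) for i in rows}
        col_pen = {j: penalty([cost[i, j] for i in rows]) for j in cols}
        i_best, rp = max(row_pen.items(), key=lambda kv: kv[1])
        j_best, cp = max(col_pen.items(), key=lambda kv: kv[1])
        if rp >= cp:                     # the greatest penalty decides the row or column
            i = i_best
            j = min(cols, key=lambda j: cost[i, j])   # cheapest cell in that row
        else:
            j = j_best
            i = min(rows, key=lambda i: cost[i, j])   # cheapest cell in that column
        qty = min(supply[i], demand[j])
        alloc[i, j] = qty
        supply[i] -= qty
        demand[j] -= qty
        if supply[i] == 0:
            rows.discard(i)
        if demand[j] == 0:
            cols.discard(j)
    return alloc

cost = [[15, 18, 22, 16], [15, 19, 20, 14], [13, 16, 23, 17]]
plan = vam(cost, [30, 40, 30], [20, 20, 25, 35])
print((plan * np.array(cost)).sum())     # expected total cost: 1630
```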
Check your progress 1
1. The initial solution of a transportation problem can be obtained by using any of the three known methods. However, the only condition is that
(a) the solution be optimal.              (b) the rim conditions are satisfied.
(c) the solution not be degenerate.       (d) all of the above.

2. One disadvantage of using the North-West Corner Rule to find an initial solution to the transportation problem is that
(a) it is complicated to use.
(b) it leads to a degenerate initial solution.
(c) it does not take into account the cost of transportation.
(d) all of the above.

3. State true or false: In a transportation problem, the number of sources must be the same as the number of destinations.

4. The method of finding an initial solution based upon opportunity costs is called __________

5. Find with which initial basic feasible solution method the following solution developed, what
is the total cost of transportation?

TO
FROM
P Q R S Supply
A 12[180] 10[150] 12[170] 13 500
B 7 11 8[180] 14[120] 300
C 6 16 11 7[200] 200
Demand 180 150 350 320 1000

3.3 Optimal Solution Method – Modified Distribution Method (MODI


Method)
The modified distribution method, also known as MODI method or u-v method provides a
minimum cost solution to the transportation problem. The steps involved in the Modified
distribution method are as follows:

1)Find out a basic feasible solution of the transportation problem using one of the 'three methods
described in the previous section. Check m + n - 1 = number of occupied cells (where m =
number of rows and n= number of columns) condition to apply MODI method first. For every
step of method, it is compulsory to check above condition.
2) Introduce dual variables corresponding to the row constraints and the column constraints. If there are m origins and n destinations, then there will be m + n dual variables. The dual variables corresponding to the row constraints are denoted by ui (i = 1, 2, …, m), while the dual variables corresponding to the column constraints are denoted by vj (j = 1, 2, …, n).

3)The values of the dual variables should be determined from the following equations. Values
can be calculated only with the help of occupied cells.
ui + vj = cij

One of the dual variables can be chosen arbitrarily. It is to be also noted that as the primal
constraints are equations, the dual variables are unrestricted in sign. Any positive or negative
number can be selected but it is always good to allocate zero with no sign. The best way to
assign zero is to select a row or column where maximum number of occupied cells are located.

4) Now find the opportunity costs of each unoccupied cells (The cells where no allocation has
been made) with the help of following formula:
Δij = Cij – (ui + vj)

If any value is negative, it means there is scope for reducing the transportation cost by that many rupees per unit.

5) Repeat the procedure until all values of cij – (ui+vj) ≥ 0.
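One MODI check (computing the dual variables and the opportunity costs of the unoccupied cells) can be sketched as below. The helper modi_check is illustrative, not a full iterative implementation, and it assumes a non-degenerate solution in which the m + n – 1 occupied cells form a connected set; the data used are those of the VAM initial solution of Example II that follows.

```python
# A minimal sketch of one MODI check: solve ui + vj = cij on the occupied cells
# (with u1 fixed to 0) and report the opportunity cost of every unoccupied cell.
import numpy as np

def modi_check(cost, occupied):
    m, n = cost.shape
    u, v = [None] * m, [None] * n
    u[0] = 0                                     # one dual variable is chosen arbitrarily
    # Propagate values through the occupied cells until every ui and vj is known
    # (assumes a non-degenerate, connected set of m + n - 1 occupied cells).
    while any(x is None for x in u) or any(x is None for x in v):
        for i, j in occupied:
            if u[i] is not None and v[j] is None:
                v[j] = cost[i, j] - u[i]
            elif v[j] is not None and u[i] is None:
                u[i] = cost[i, j] - v[j]
    delta = {(i, j): cost[i, j] - (u[i] + v[j])
             for i in range(m) for j in range(n) if (i, j) not in occupied}
    return u, v, delta

# Initial VAM solution of Example II below (rows P1-P3, columns D1-D4).
cost = np.array([[20, 30, 50, 17],
                 [70, 35, 40, 60],
                 [40, 12, 60, 25]])
occupied = {(0, 0), (0, 3), (1, 2), (1, 3), (2, 1), (2, 3)}
u, v, delta = modi_check(cost, occupied)
print(u, v)     # expected: [0, 43, 8] and [20, 4, -3, 17]
print(delta)    # the entry for (1, 1), i.e. P2D2, is -12: the cost can still be reduced
```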

Let us consider the transportation problem given in Example II, with a basic feasible solution computed by Vogel's Approximation Method (VAM):

Example II

Initial Solution (VAM): Non-optimal Solution


Plant Distribution Centres Supply ui
D1 D2 D3 D4
P1 20 [5] 30 50 17 [2] 7 0
P2 70 35 (+) 40 [7] 60 [3] (-) 10 43
P3 40 12 [8](-) 60 25 [10] (+) 18 8
Demand 5 8 7 15 35
Vj 20 4 -3 17

Total Cost TC = (20*5) + (17*2) + (40*7) + (60*3) + (12*8) + (25*10) = 940 Rs.

Step 1. Initial basic feasible solution by VAM and m + n – 1 = 3 rows + 4 columns – 1 = 6


occupied cells. Here in initial solution there are 6 occupied cells so go to step 2.

Steps 2 and 3. The dual variables can be calculated as follows, putting u1 = 0 for row P1 and considering only the occupied cells:
u1 + v1 = 20 → v1 = 20;  u1 + v4 = 17 → v4 = 17;  u2 + v4 = 60 → u2 = 43;
u3 + v4 = 25 → u3 = 8;  u3 + v2 = 12 → v2 = 4;  u2 + v3 = 40 → v3 = –3

Step 4. Calculate opportunity costs for each unoccupied cell

Unoccupied Cell     Opportunity Cost Δij = Cij – (ui + vj)

P1D2                30 – (0 + 4) = 26
P1D3                50 – (0 + (–3)) = 53
P2D1                70 – (43 + 20) = 7
P2D2                35 – (43 + 4) = –12
P3D1                40 – (8 + 20) = 12
P3D3                60 – (8 + (–3)) = 55

The negative value at cell P2D2 shows that a cost reduction of Rs. 12 per unit is possible.

The closed loop (shown with + and – signs in the non-optimal solution above) always starts at the selected unoccupied cell, which carries a plus sign. Except for this starting cell, every other corner of the loop is an occupied cell. The signs alternate around the loop: plus, minus, plus, minus and so on, ending back at the starting cell. The loop starting from P2D2 is shown in the non-optimal solution above. To decide how many units to shift, consider the cells carrying minus signs and select the minimum allocation among them. Here the minus-sign cells have allocations of 3 and 8 units, so 3 units are shifted. Shift these 3 units according to the signs: add 3 where there is a plus sign and subtract 3 where there is a minus sign. The new solution is as below:

Plant Distribution Centres Supply ui


D1 D2 D3 D4
P1 20 [5] 30 50 17 [2] 7 0
P2 70 35 [3] 40 [7] 60 10 31
P3 40 12 [5] 60 25 [13] 18 8
Demand 5 8 7 15 35
Vj 20 4 9 17

Now, again check opportunity costs of each unoccupied cell as explained above, if all
opportunity costs are zero or greater than zero then, it is an optimal solution.

Unoccupied Cell Opportunity Cost Δij = Cij – (ui + vj)

P1D2 30 – (0 +4) = 26
P1D3 50 – (0 + 9) = 41
P2D1 70 – (31 + 20) = 19
P2D4 60 – (31 + 17) = 12
P3D1 40 – (8 + 20) = 12
P3D3 60 – ( 8 + 9) = 43

Final transportation schedule is:


P1 to D1 = 5 Units P1 to D4 = 2 Units P2 to D2 = 3 Units
P2 to D3 = 7 Units P3 to D2 = 5 Units P3 to D4 = 13 Units

Total cost TC = (20*5) + (17*2) + (35*3) + (40*7) + (12*5) + (25*13) = 904 Rs.

Check your progress 2


1. State true or false

In a transportation problem, the total demand of destinations must be identical to the total
capacity of sources, otherwise it cannot be solved.

2. In vogel’s approximation method the differences of the smallest and second smallest
costs in each row and column are called ______.

3. The solution to a transportation problem with m rows and n columns is feasible if the


number of positive allocations is
(a) m + n (b) m x n
(c) m + n - 1 (d) all of the above.

3.4 Special Cases of Transportation

3.4.1Unbalanced Transportation Problem

When total supply and total demand are not equal, the problem is known as an unbalanced transportation problem. To make it balanced, add a dummy row with zero cost in each cell if supply is less, or add a dummy column with zero cost in each cell if demand is less. The following example makes the procedure clear:

Example III

A B C Supply
X 9 11 10 40
Y 10 8 12 60
Z 12 7 8 50
Demand 50 40 30 120 / 150

Solution

Here supply is 150 and demand is 120, so demand is less by 30 units, as demand is less, we will
add dummy column with D destination with zero transportation costs as actually it does not
contribute in total transportation cost. If supply is less, add dummy row with zero transportation
cost. Solution for the same is shown in table below:
         A     B     C     D     Supply
X        9     11    10    0     40
Y        10    8     12    0     60
Z        12    7     8     0     50
Demand   50    40    30    30    150
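Balancing an unbalanced problem is a one-line padding operation once the shortfall is known. The sketch below (NumPy-based, purely illustrative) adds the dummy column for Example III.

```python
# Balancing Example III by adding a dummy destination D with zero costs.
import numpy as np

cost = np.array([[9, 11, 10],
                 [10, 8, 12],
                 [12, 7, 8]])
supply = [40, 60, 50]                     # total 150
demand = [50, 40, 30]                     # total 120 -> 30 units short
shortfall = sum(supply) - sum(demand)
if shortfall > 0:                         # demand is less: add a dummy column
    cost = np.hstack([cost, np.zeros((cost.shape[0], 1), dtype=int)])
    demand = demand + [shortfall]
print(cost)                               # last column is the zero-cost dummy D
print(demand)                             # [50, 40, 30, 30]
```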

3.4.2Multiple Optimal Solutions

If the opportunity costs of all unoccupied cells are positive, the solution is the unique optimum. However, when the opportunity cost of some unoccupied cell is zero, another transportation schedule is possible without increasing or decreasing the total transportation cost. Units can be shifted into that unoccupied cell according to the closed-loop rule, and the result is an alternative transportation schedule with the same total transportation cost.

3.4.3Degeneracy

A basic feasible solution of a transportation problem has m+n-1 basic variables, which means
that the number of occupied cells in such a solution is one less than the number of rows plus the
number of columns, It may happen sometimes that the number of occupied cells is smaller than
m+n-1. Such a solution is called a degenerate solution.
Degeneracy in a transportation problem can arise in two ways:
1) While obtaining Initial feasible Solution
2) While Revising the solution

When a solution is degenerate, the difficulty is that it cannot be tested for optimality.

To overcome degeneracy, an infinitesimally small amount, close to zero, is allocated to one (or more, as needed) empty cell(s), and each such cell is then treated as an occupied cell. This quantity is represented by the Greek letter ϵ (epsilon); the letter Δ (delta) can also be used. Some arithmetic rules for epsilon are:

k + ϵ = k;  k – ϵ = k;  0 + ϵ = ϵ;
ϵ + ϵ = ϵ;  ϵ – ϵ = 0;  k * ϵ = 0.

It is important to remember that an epsilon cannot be placed in any randomly selected unoccupied cell.

I.While obtaining Initial feasible Solution


An epsilon is inserted in the least cost independent cell. An independent cell is one from which a
closed loop cannot be traced. It may be further noted that if a given problem requires two (or
more) epsilons, then a cell in which an epsilon has already been placed will be treated as
occupied while determining independence of cells for inserting an epsilon subsequently.

II.While Revising the solution


When the problem becomes degenerate at the solution-revision stage, epsilon (ϵ)is placed in one
(or more, if required) of the recently vacated cells with the minimum cost. And then we proceed
with the problem in the usual manner.

Example IV:
A company wants to ship loads of his product shown below. The matrix shows the kilometers
from sources of supply to the destination.

Shipping cost is Rs. 10/Load per km. what shipping schedule should be used to minimize total
transportation cost?

Solution:
Since the total destination requirement of 25 units is more than the total source capacity of 22 units, the excess requirement is handled by adding a dummy plant Sexcess with a capacity of 3 units. We use zero transportation cost for the dummy plant.
Then modified total is shown below:

To obtain initial solution:


We use Vogel’s approximation method and get the following solution.
This solution includes only 7 occupied cells, whereas m + n – 1 = 4 + 5 – 1 = 8 are required. So, the initial solution is degenerate.

In order to remove degeneracy, we assign Δ to unoccupied cell (S2, D5) which has minimum cost
among unoccupied cells as shown in table 2.

We use the MODI method, and therefore first we have to find ui, vj and Δij using the following relations:
cij = ui + vj for occupied cells
Δij = cij – (ui + vj) for unoccupied cells.

Here some Δij are not greater than or equal to zero, so this is not an optimal solution. To improve it we choose cell (Sexcess, D3), because it has the largest negative value and must enter the basis. We then trace the closed path for cell (Sexcess, D3): (Sexcess, D3) → (Sexcess, D4) → (S2, D4) → (S2, D5) → (S1, D5) → (S1, D3) → (Sexcess, D3), and min(Δ, 3, 5) = Δ. The new solution is shown in Table 4:

To check its optimality, we again calculate ui, vj and Δij.


This is shown in Table 5:

Here again some Δij are not greater than or equal to zero, so this is still not an optimal solution. We now choose cell (S3, D4), which has the largest negative value, to enter the basis, and trace the closed path (S3, D4) → (S3, D5) → (S1, D5) → (S1, D3) → (Sexcess, D3) → (Sexcess, D4) → (S3, D4). Here min(3, 5) = 3, and the resulting solution is shown in Table 6.
Again, we check optimality and calculate ui, vj and Δij.
Since (S3, D3) < 0, this is still not an optimal solution. We choose cell (S3, D3) to enter the basis and mark the closed path (S3, D3) → (S3, D5) → (S1, D5) → (S1, D3) → (S3, D3); the modified table is shown as Table 8.
Again, we check optimality by calculating ui, vj and Δij:

Since all Δij ≥ 0, this is an optimal solution which is shown as follows:

The minimum total transportation cost associated with this solution is

= [(4×4)+(4×4)+(2×6)+(3×0)+(3×6)+(1×6)+(8×3)] * 10 (shipping cost/load)

= (16+16+12+0+18+6 +24) *10= Rs. 920

2. Degeneracy at Subsequent (Later) Iterations:


To resolve degeneracy which occurs during the optimality test, an infinitesimally small quantity (ϵ or Δ) may be allocated to one or more cells which have become unoccupied recently, so as to have m + n – 1 occupied cells in the new solution.

Example V
Goods have to be transported from sources S1, S2 and S3 to destinations D1, D2 and D3. The
transportation cost per unit capacities of the sources and requirements of the destination are
given in the following table.

Determine a transportation schedule so that cost is minimized

Solution:
To find the initial basic feasible solution we use the North-West Corner method. The non-degenerate initial basic feasible solution is given in Table 1.

Here total occupied cell = m + n – 1= 3 + 3-1= 5

Therefore, there is no degeneracy. To test optimality we use the MODI method; for this we first calculate ui, vj and Δij.
Since the unoccupied cell (S3, D1) has the largest negative opportunity cost, cell (S3, D1) is entered into the basis. We then trace the closed path (S3, D1) → (S3, D2) → (S2, D2) → (S2, D1) → (S3, D1). Here the minimum allocation among the minus-sign cells is 300, so the modified solution is given below:

But in this solution degeneracy occurs, because the total number of positive allocations becomes 4, which is less than the required number m + n – 1 = 3 + 3 – 1 = 5.

Hence this is a degenerate solution. To remove the degeneracy, a quantity Δ is assigned to one of the cells that has just become unoccupied, so that there are again m + n – 1 occupied cells; assign Δ to either (S1, D1) or (S3, D2) and proceed with the usual solution procedure.
Again, proceed with the usual solution procedure. The optimal solution is given as follows: with
total transportation cost = 1900 Rs.

Check your progress 3


1. In the result of QM’s transportation model, if it shows that Source 2 should ship 45 units
to a “dummy” destination, then it means that ___________.

2. _______________ to confirm that the example is having multiple optimal solutions.

3. State true or false


In transportation problem, all special cases cannot occur together

3.5 Let Us Sum Up

The transportation problem is a special type of linear programming problem. The graphical or


ordinary simplex method is not well suited to solving a transportation problem, because the transportation problem has a special structure which can be exploited to develop efficient computational techniques for its solution.

In the most general form, a transportation problem has a number of origins and a number of
destinations. A certain amount of a particular shipment is available in each origin. Likewise,
each destination has a certain requirement/demand. The transportation problem indicates the
amount of shipment to be transported from various origins to different destinations so that the
total transportation cost is minimized without violating the availability constraints and the
requirement constraints. The number of techniques is available for computing an initial basic
feasible solution of a transportation problem. These are the North West Corner rule, Least Cost
method and Vogel's Approximation Method (VAM). Optimum solution of a transportation
problem can becalculated from Modified Distribution (MODI) Method. Sometimes the total
available supply at the origins is different from the total demand at the destinations. Such a
transportation problem is said to be unbalanced. An unbalanced transportation problem can be
made balanced by introducing an additional dummy row or column with zero transportation cost. A basic feasible solution of a transportation problem with m origins and n destinations should have m + n – 1 positive basic variables. However, if the number of positive basic variables is less than m + n – 1, the solution is said to be degenerate. A degenerate transportation problem can be handled by adding an epsilon at an independent cell (or cells).

3.6 Answers for Check your Progress

Answers to check your progress 1

1. b

2. c

3. False

4. Vogel’s Approximation Method

5. North-West Corner method, TC = 10,220

Answers to check your progress 2

1. True

2. Penalty

3. m + n -1

Answers to check your progress 3

1. Source 2 has 45 units of unused (surplus) capacity that are not actually shipped; the transportation cost for the dummy cell is zero

2. The opportunity cost Δij of at least one unoccupied cell equals zero (while all Δij ≥ 0)

3. False
3.7 Glossary

The Source/Origin: of a transportation problem is a location from which shipments are dispatched.
The Destination: of a transportation problem is the location to which shipments are transported.
The Unit Transportation Cost: is the cost of transporting one unit of the consignment from an origin to a destination.
The North West Corner Rule: is a method of computing a basic feasible solution of a transportation problem where the basic variables are selected from the north-west corner, i.e. the top left corner.
The Least Cost Method: is a method of computing a basic feasible solution of a transportation problem where the basic variables are chosen according to the unit cost of transportation.
The Vogel's Approximation Method (VAM): is an iterative procedure for computing a basic feasible solution of the transportation problem.
The Modified Distribution Method (MODI): is a method of computing the optimum solution of a transportation problem.

An Unbalanced Transportation Problem: is a transportation problem where the total availability at the origins is different from the total requirement at the destinations.
Multiple Optimal Solutions: more than one optimal transportation schedule with the same total transportation cost.

A Degenerate Transportation Problem: with m origins and n destinations has a basic feasible solution with fewer than m + n – 1 positive basic variables.

3.8 Assignment
1. Find an initial basic feasible solution to the following transportation problem. Is it an optimal?
Use VAM &MODI method.

D1 D2 D3 D4 Available Units

O1 5 4 2 1 130

O2 2 3 7 5 100

O3 5 4 5 6 30
Demand 40 50 70 100

2. Mr. Contractor is a builder and the owner of Ashiana Construction Company. Currently he has three large housing projects in hand. They are located at Andheri, Bandra and Chinchwad. He procures cement from four plants located at Dumdum, Ellora, Feroza and Guna. The basic feasible solution as determined by the North West Corner rule is given below:

Projects A B C Availability
Plants
1 2[50] 7 4 50
2 3[20] 3[60] 1 80
3 5 4[30] 7[40] 70
4 1 6 2[140] 140
Demand 70 90 180 340

Mr. Contractor wants to plan the movement of cement in such a manner that the optimal minimum transportation cost is reached. Assist him.

3. A company has three plants and four warehouses. The supply and demand in units and the
corresponding transportation costs are given below. An initial solution has also been given: its
occupied cells carry allocations of 10 units from Plant 1; 20 and 5 units from Plant 2; and 5, 10
and 5 units from Plant 3.

Warehouses
Plants I II III IV Supply
1 5 10 4 5 10
2 6 8 7 2 25
3 4 2 5 7 20
Demand 25 10 15 5 55
Answer the following questions, giving brief reasons:
(a) Is this solution degenerate?
(b) Is this solution optimal?
(c) Does this problem have more than one optimal solution? If so,
show all of them.
4. A company has three plants and three warehouses. The supply and demand in units and the
corresponding transportation costs are given. The table below shows an initial solution of the
problem. Find an optimal solution.
5. A product is produced by four factories A, B, C and D. The per unit production costs are Rs.
2, Rs. 3, Rs. 1 and Rs. 5 respectively. The production capacities of A, B, C and D are 50,
70, 30 and 50 units respectively. These factories supply the product to four stores I, II, III
and IV with demands of 25, 35, 105 and 20 units respectively. The per unit transportation costs
in rupees are given in the table below. Determine the extent of deliveries from each of the
factories to each of the stores so that the total cost (production and transportation cost) is
minimum.

Stores
I II III IV
Factory A 2 4 6 11
Factory B 10 8 7 5
Factory C 13 3 9 12
Factory D 4 6 8 3

3.9 Activities

Select any transportation company or manufacturing company. Select 3 or 4 sources and 3 or 4
destinations. Collect data regarding the total supply at each source, the total demand at each
destination and the unit transportation cost from each source to each destination. Arrange the data
in a proper transportation table and find the optimum cost schedule.

3.10 Case Study

XYZ Shipping Corp. is a leading shipping corporation of the nation. It has offices in Mumbai and
Gandhidham. It provides services to different companies and transports their goods from
warehouses to marketplaces. The following table provides all necessary information on the
availability of supply at each warehouse and the requirements of the various markets; the unit
transportation cost (in thousand Rs) from each warehouse to each market is also mentioned below.
Mr. Sanjay, the shipping clerk of the agency, usually prepares the transportation schedule based on
his expertise and vast experience. Mr. Sanjay has worked out the following schedule on the basis of
his assumptions: 12 units from A to Q, 1 unit from A to R, 9 units from A to S, 15 units from B to R,
7 units from C to P and 1 unit from C to R.

Markets
Warehouse P Q R S Supply
A 6 3 5 4 22
B 5 9 2 7 15
C 5 7 8 6 8
Demand 7 12 17 9 45
a) Being a consultant of the company, check and analyze whether Mr. Sanjay has arranged an
optimal schedule or not. You can apply the transportation method.
b) Find the optimal schedule and the minimum total transportation cost. Does this problem have
only one optimal solution? Justify your answer.

3.11 Further Reading

1. Quantitative Techniques in Management, by N.D. Vora, McGraw Hill


2. Operations Research: Theory and Applications, by J.K. Sharma, Macmillan
3. Operations Research, by Hamdy A. Taha, Pearson Education
4. Quantitative Analysis for Management, by Barry Render, Ralph M. Stair, Michael E.
Hanna and T. N. Badri, Pearson Publication
5. Use of software like QM for Windows, Excel Solver
Unit No. 4 Assignment
_________________________________
Unit Structure
4.0 Learning Objectives

4.1 Introduction
4.1.1Basic Structure of Assignment

4.2 Assignment – Hungarian Assignment Method


4.2.1 Algorithm of Hungarian Assignment Method
4.2.2 Example of Hungarian Assignment Method
Check your progress 1

4.3 Special Cases of Assignment


4.3.1 Unbalanced Assignment Problem
4.3.2 Prohibited Assignment Problem
4.3.3 Multiple Optimal Solutions
4.3.4 Maximization types of problems
Check your progress 2

4.4 Let Us Sum Up

4.5 Answers for Check your Progress

4.6 Glossary

4.7 Assignment

4.8 Activities

4.9 Case Study

4.10 Further Reading


4.0 Learning Objectives
• Understand the concept and assumptions of the assignment problem in a comprehensive manner
• Learn the algorithm of the Hungarian assignment method
• Use the algorithm for solving an assignment problem
• Learn special cases of assignment

4.1 Introduction

The assignment problem in its general form can be stated as follows: given m facilities, n jobs and
the effectiveness of each facility for each job, the problem is to assign each facility to one and only
one job in such a way that the measure of effectiveness is optimized (maximized or minimized).
Several problems of management can be treated as assignment problems. A project manager may
have five people available for assignment and five projects to fill; he is interested in knowing which
project should be assigned to which person so that all project tasks are accomplished in the shortest
possible time. Likewise, an institute may have different subjects to be offered by different faculty
members, and the task is to assign subjects in such a way that the faculty can complete them
efficiently and in the shortest time. In a marketing set-up, by estimating the sales performance of
different salesmen in different territories, one could assign a particular salesman to a particular
territory with a view to maximizing overall sales. It may be noted that with n facilities and n jobs
there are n! possible assignments. One way of finding an optimum assignment is to write out all the
n! possible arrangements, evaluate their total cost (in terms of the given measure of effectiveness)
and select the assignment with minimum cost. This, however, leads to a lengthy computational
process. Hence it is necessary to develop a suitable computational procedure to solve an assignment
problem.
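
To get a feel for how quickly complete enumeration becomes impractical, the short Python check below (purely illustrative) prints the number of possible assignments, n!, for a few values of n.

import math

for n in (3, 5, 8, 10, 15):
    # each of the n facilities paired with a distinct job gives n! complete assignments
    print(n, "facilities/jobs:", math.factorial(n), "possible assignments")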

4.1.1 Basic Structure of Assignment

Consider this example to understand the basic structure of an assignment problem. The following
cost table is given. The important condition here is that the number of rows equals the number of
columns. The assignment must always be one to one: each operator is assigned to exactly one
machine and each machine to exactly one operator.

Operator Machine
A B C D
1 10 2 8 6
2 9 5 11 9
3 12 7 14 14
4 3 1 4 2
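
Small tables like this one can also be solved directly with an off-the-shelf solver, which is useful for checking hand computations. The sketch below assumes SciPy is available; scipy.optimize.linear_sum_assignment implements an efficient variant of the Hungarian-type procedure described in the next section.

import numpy as np
from scipy.optimize import linear_sum_assignment

# operator (rows 1-4) x machine (columns A-D) cost table from the text
cost = np.array([[10, 2, 8, 6],
                 [9, 5, 11, 9],
                 [12, 7, 14, 14],
                 [3, 1, 4, 2]])

rows, cols = linear_sum_assignment(cost)    # minimises the total cost by default
for r, c in zip(rows, cols):
    print(f"Operator {r + 1} -> Machine {'ABCD'[c]} (cost {cost[r, c]})")
print("Minimum total cost:", cost[rows, cols].sum())

For this table the minimum total cost works out to 26, and more than one assignment attains it.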
4.2 Assignment – Hungarian Assignment Method
4.2.1 Algorithm of Hungarian Assignment Method

The following algorithm is applied when the objective function is of the minimization type.

Step 1: Set up the cost table from the given problem. If the number of rows is not equal to the
number of columns, a dummy row or column must be added with zero costs.

Step 2: Find the smallest cost in each row of the cost table. Subtract this smallest cost element
from each element in that row. There will then be at least one zero in each row of this new
table, called the first reduced cost table.

Step 3: Find the smallest element in each column of the reduced cost table. Subtract this smallest
cost element from each element in that column. As a result, each row and column now has at
least one zero value in the second reduced cost table.

Step 4: Draw the minimum number of horizontal and vertical lines needed to cover all the zeros.

Step 5: If the number of lines drawn equals the number of rows (and columns), go to Step 6;
otherwise go to Step 7.

Step 6: Starting with the first row, make an assignment wherever a row contains a single zero and
cross out the remaining zeros in the corresponding column. Repeat the procedure until an
assignment has been made for every job. An optimal assignment is found when the number of
assigned cells equals the number of rows (and columns).

Step 7: Examine the elements that are not covered by a line. Choose the smallest of these
elements, subtract it from all the elements that do not have a line through them and add it to
every element that lies at the intersection of two lines. The resulting matrix is a revised cost
table. Return to Step 4 and repeat until the number of lines drawn equals the number of rows and
columns.

Step 8: Repeat Step 6 to make the assignments.

Step 9: Calculate the total cost or profit with reference to the original matrix.
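
Steps 2 and 3 (the row and column reductions) translate almost directly into array operations. The sketch below, assuming NumPy is available, reduces the operator-machine cost table of Section 4.1.1 and confirms that every row and column of the result contains at least one zero.

import numpy as np

cost = np.array([[10, 2, 8, 6],
                 [9, 5, 11, 9],
                 [12, 7, 14, 14],
                 [3, 1, 4, 2]], dtype=float)

# Step 2: subtract each row's minimum from that row (first reduced cost table)
reduced = cost - cost.min(axis=1, keepdims=True)
# Step 3: subtract each column's minimum from that column (second reduced cost table)
reduced -= reduced.min(axis=0, keepdims=True)

print(reduced)
print("zeros per row:", (reduced == 0).sum(axis=1))
print("zeros per column:", (reduced == 0).sum(axis=0))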

4.2.2 Example of Hungarian Assignment Method

Example I

Let us assume that Geeta is a sorority pledge coordinator with four jobs and only three pledges.
Geeta decides that the assignment approach is appropriate, except that she will attempt to
minimize total time instead of money (since the pledges aren't paid). Geeta also realizes that she
will have to create a dummy fourth pledge, and she knows that whatever job gets assigned to that
pledge will not be done (this semester, anyhow). She creates estimates for the respective times
and places them in the following table. E is, of course, a dummy pledge, so her times are all zero.

Job 1 Job 2 Job 3 Job 4


B 4 9 3 8
C 7 8 2 6
D 3 4 5 7
E 0 0 0 0

Solution of Example I
(a) The first step in this algorithm is to develop the opportunity cost table. This is done by
subtracting the smallest number in each row from every value in that row, then, using these
newly created figures, by subtracting the smallest number in each column from every other value
in that column. Whenever these smallest values are zero, the subtraction results in no change.
Job 1 Job 2 Job 3 Job 4
B 1 6 0 5
C 5 6 0 4
D 0 1 2 4
E 0 0 0 0

No change was produced when dealing with the columns since the smallest values were always
the zeros from row four.
(b) The next step is to draw lines through all of the zeros. The lines are to be straight and either
horizontal or vertical. Furthermore, you are to use as few lines as possible. If it requires four of
these lines (four because it is a 4 × 4 matrix), an optimal assignment is already possible. If it
requires fewer than four lines, another step is required before optimal assignments may be made.
In our example, draw a line through row four, column three, and either column one or row three.

Job 1 Job 2 Job 3 Job 4


B 1 6 0 5
C 5 6 0 4
D 0 1 2 4
E 0 0 0 0

(c) Since the number of lines required was less than the number of assignees, a third step is
required (as is normally the case). Looking at the version of the matrix with the lines through it,
determine the smallest number not covered by a line. Subtract this smallest number from every
number not covered by a line and add it to every number at the intersection of two lines.
Job 1 Job 2 Job 3 Job 4
B 0 5 0 4
C 4 5 0 3
D 0 1 3 4
E 0 0 1 0

Draw the minimum number of lines to cover all the zeroes, and we have the matrix below.
Job 1 Job 2 Job 3 Job 4
B 0 5 0 4
C 4 5 0 3
D 0 1 3 4
E 0 0 1 0

Since only 3 lines are needed to cover the zeroes, we determine the smallest number not covered
by a line. Subtract this smallest number from every number not covered by a line and add it to
every number at the intersection of two lines. The result is shown with the new lines drawn
through the zeroes.
Job 1 Job 2 Job 3 Job 4
B 0 4 0 3
C 4 4 0 2
D 0 0 3 3
E 1 0 2 0

(d) Since this matrix requires four lines to cover all zeros, we have now reached an optimal
solution stage.
(e) In our example the assignments must be: C to job 3 = 2 , B to job 1 = 4, D to job 2 = 4 and E
to job 4 = 0. Since E is a dummy row, the job labeled job 4 does not get completed. So, the total
time is 10.
Check your progress 1
1. An optimal solution of an assignment problem can be obtained only if
(a) each row and column has only one zero element
(b) each row and column has at least one zero element
(c) the data are arranged in a square matrix
(d) none of the above

2. In an assignment problem,
(a) one agent can do parts of several tasks
(b) one task can be done by several agents
(c) each agent is assigned to its own one best task
(d) none of the above

3. Even though the number of lines drawn is not equal to the number of rows and columns, an
optimal solution can be found. State true or false.

4. The procedure used to solve assignment problems wherein one reduces the original
assignment costs to a table of opportunity costs is called __________.

4.3Special Cases of Assignment

4.3.1 Unbalanced Assignment Problem: When the number of rows and the number of columns
are not the same, it is a case of an unbalanced assignment problem. A dummy row or column,
whichever is fewer, needs to be added with zero costs. In Example I, we added a dummy row with
zero costs to solve an unbalanced assignment problem.

4.3.2 Prohibited Assignment Problem: When some routes are closed, or some
tasks/projects/work cannot be assigned to a particular machine, worker or employee for any
reason, it is a case of a prohibited assignment problem: there is a restriction on the assignment. To
solve such problems, put "-" or "M" (a very large cost) wherever there is a prohibition and then
solve the problem by the algorithm explained for Example I. Do not perform any operation on the
restricted cells; keep them as they are and never allocate to them. The following example will
make it clear.

In a production unit four new machines M1, M2, M3 and M4 are to be installed in a machine
shop. There are five vacant places A, B, C, D and E available. Because of limited space, machine
M2 cannot be placed at C and M3 cannot be placed at A. The cost of locating a machine at a
place, in thousands of rupees, is as under:

A B C D E
M1 4 6 10 5 6
M2 7 4 - 5 4
M3 - 6 9 6 2
M4 9 3 7 2 3
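
One practical way to handle such prohibited cells is to replace each '-' with a prohibitively large cost M before solving, so that no sensible solution ever uses them. The sketch below (assuming SciPy is available; the value of M is an arbitrary large number of our choosing) applies this idea to the machine-location table above, and also pads the 4 x 5 table with a dummy machine row of zeros because the problem is unbalanced.

import numpy as np
from scipy.optimize import linear_sum_assignment

M = 10**6                                  # a very large cost standing in for a prohibited cell
cost = np.array([[4, 6, 10, 5, 6],         # M1 at places A..E
                 [7, 4,  M, 5, 4],         # M2 cannot be placed at C
                 [M, 6,  9, 6, 2],         # M3 cannot be placed at A
                 [9, 3,  7, 2, 3],         # M4
                 [0, 0,  0, 0, 0]])        # dummy machine: 4 machines for 5 places

rows, cols = linear_sum_assignment(cost)
for r, c in zip(rows, cols):
    name = f"M{r + 1}" if r < 4 else "Dummy"
    print(f"{name} -> place {'ABCDE'[c]} (cost {cost[r, c]})")
print("Minimum total cost:", sum(cost[r, c] for r, c in zip(rows, cols) if r < 4))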
4.3.3 Multiple Optimal Solutions: When, during the final assignment, no remaining row or
column contains a single zero, it can be a case of multiple optimal solutions. For example, one row
and one column may both have two zeros; arbitrarily start with any one of these zeros and
complete the assignment, and then repeat starting with the other zero. The assignments in rows or
columns containing a single zero will remain the same in both solutions. The following example
will make it clear.

Example II

Consider the following assignment problem: The Spicy Spoon restaurant has four payment
counters. There are four persons available for service. The cost of assigning each person to each
counter is given in the following table. Assign one person to one counter to minimize the total
cost.

Person 1 2 3 4
A 1 8 15 22
B 13 18 23 28
C 13 18 23 28
D 19 23 27 31

Solution of Example II

After applying steps 1 to 3 of the Hungarian Method, we obtain the following matrix.

Person 1 2 3 4
A 0 3 6 9
B 0 1 2 3
C 0 1 2 3
D 0 0 0 0

Now by applying the usual procedure, we get the following matrix.

Person 1 2 3 4
A 0 2 5 8
B 0 0 1 2
C 0 0 1 2
D 1 0 0 0

The resulting matrix leads to the alternative optimal solutions shown below.
Option 1

Person 1 2 3 4
A 0 2 4 7
B 0 0 0 1
C 0 0 0 1
D 2 1 0 0

Option 2

Person 1 2 3 4
A 0 2 4 7
B 0 0 0 1
C 0 0 0 1
D 2 1 0 0

The persons B and C may be assigned either to job 2 or to job 3.


The two alternative assignments are:
A1 + B2 + C3 + D4 = 1 + 18 + 23 + 31 = 73
A1 + B3 + C2 + D4 = 1 + 23 + 18 + 31 = 73

4.3.4 Maximization Type of Problem: First select the maximum value in the whole matrix and
subtract every other value from this maximum value; the new matrix is known as the revised cost
matrix. Then apply the Hungarian assignment method to this revised matrix, as explained in
Example I. The following example will make it clear.

A company has four sales representatives who are to be assigned to four different sales territories.
The monthly sales increase estimated for each sales representative in each territory (in lakh
rupees) is shown in the following table. Suggest the optimal assignment and the total maximum
sales increase per month.

Sales Sales Territories


Representatives
I II III IV

A 200 150 170 220

B 160 120 150 140

C 190 195 190 200

D 180 175 160 190
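
A minimal sketch of this conversion for the table above (assuming NumPy and SciPy are available): the sales matrix is subtracted from its largest entry to form the revised cost matrix, which is then minimized; the assignment obtained also maximizes the total sales increase.

import numpy as np
from scipy.optimize import linear_sum_assignment

# estimated monthly sales increase (Rs lakh): representatives A-D across territories I-IV
sales = np.array([[200, 150, 170, 220],
                  [160, 120, 150, 140],
                  [190, 195, 190, 200],
                  [180, 175, 160, 190]])

revised_cost = sales.max() - sales           # convert the maximization problem to minimization
rows, cols = linear_sum_assignment(revised_cost)

territories = ["I", "II", "III", "IV"]
for r, c in zip(rows, cols):
    print(f"Representative {'ABCD'[r]} -> Territory {territories[c]} (increase {sales[r, c]})")
print("Maximum total sales increase:", sales[rows, cols].sum())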


Check your progress 2
1. An assignment problem cannot be solved if it is of the maximization type. State true or false.

2. Solve the following assignment problem so as to minimize the time (in days) required to
complete all the tasks.

Person T1 T2 T3 T4 T5
A 6 5 8 11 16
B 1 13 16 1 10
C 16 11 8 8 8
D 9 14 12 10 16

4.4 Let Us Sum Up

The assignment problem considers the allocation of a number of jobs to a number of persons so
that the total completion time or cost is minimized or the total profit is maximized. If the number of
persons is the same as the number of jobs, the assignment problem is said to be balanced. If the
number of jobs is different from the number of persons, the assignment problem is said to be
unbalanced. An unbalanced assignment problem can be converted into a balanced assignment
problem by introducing a dummy person or a dummy job with completion time zero.

Though an assignment problem can be formulated and solved as a linear programming problem,
it is usually solved by a special method known as the Hungarian method. If the times of completion
or the costs corresponding to every assignment are written down in matrix form, the matrix is
referred to as a cost matrix. The original cost matrix can be reduced to another cost matrix by
following the steps of the algorithm. Different cases of the assignment problem are possible. If a
person is unable to carry out a particular job, the corresponding cost or completion time is taken
as very large, which automatically prevents such an assignment. Multiple optimal solutions, each
with the same cost or profit, are also possible. If the objective is to maximize performance or
profit through the assignment, the Hungarian method can be applied to a revised cost matrix
obtained from the original cost matrix.

4.5Answers for Check your Progress

Answers to check your progress 1

1. b

2. c

3. False

4. Hungarian Assignment Method


Answers to check your progress 2

1. False

2. A – T2 = 5, B – T4 = 1, C – T3 = 8, D – T1 = 9, E – T5 = 0
Total Time = 5 + 1 + 8 + 9 + 0 = 23 Days

4.6 Glossary

Assignment Problem: is a special type of linear programming problem where the objective is to
minimize the cost or time of completing a number of jobs by a number of persons.

Balanced Assignment Problem: is an assignment problem where the number of persons is equal
to the number of jobs.

Unbalanced Assignment Problem: is an assignment problem where the number of persons is
not equal to the number of jobs.

Hungarian Method: is a technique for solving assignment problems.

A Dummy Job: is an imaginary job with cost or time zero, introduced to make an unbalanced
assignment problem balanced.

Prohibited Assignment: arises when a person/machine is unable to perform a particular job.

4.7 Assignment

1. Solve the following assignment problem. (Assign one machine to one worker so that the total
time in hours is minimized.)

Machine M1 M2 M3 M4 M5

Man

A 3 2 7 4 8
B 5 4 3 8 5
C 3 7 9 1 2
D 4 2 6 5 7
E 2 8 4 6 6

2. Fix-It Shop has received three new rush projects to repair: a radio, a toaster oven, and a broken
coffee table. Three repair persons, each with different talents and abilities, are available to do the
jobs. The Fix-It Shop owner estimates the wage cost of assigning each of the workers to each of
the three projects. The costs, shown in the table, differ because the owner believes that each
worker will differ in speed and skill on these quite varied jobs. The owner's objective is to assign
the three projects to the workers in a way that will result in the lowest total cost to the shop. What
is the optimal assignment?

Project
Person 1 2 3
Adams 11 14 6
Brown 8 10 11
Cooper 9 12 7

3. ABC company is engaged in manufacturing 5 brands of packed snacks. It has five


manufacturing set-ups, each capable of manufacturing any one of its brands at a time. The cost of
making a brand on the set-ups varies according to the table below:

S1 S2 S3 S4 S5
B1 4 6 7 5 11
B2 7 3 6 9 5
B3 8 5 4 6 9
B4 9 12 7 11 10
B5 7 5 9 8 11
Find the optimum assignment of products on the setup resulting in the minimum cost.

4. Find an optimal assignment schedule

Job 1 Job 2 Job 3 Job 4


Billy 400 90 60 120
Taylor 650 120 90 180
Mark 480 120 80 180
John 500 110 90 150

4.8 Activity

An airline that operates 7 days a week has the timetable given below. Crews must have a
minimum layover of 5 hours between flights. Obtain the pairing of flights that minimizes total
layover time away from home, assuming that the crew can be based at either of the two cities.
Suggest an optimum assignment of crews that results in the smallest layover.

Delhi – Jaipur                            Jaipur – Delhi
Flight No. Depart Arrive        Flight No. Depart Arrive
1 7.00 am 8.00 am               101 8.00 am 9.15 am
2 8.00 am 9.00 am               102 8.30 am 9.45 am
3 1.30 pm 2.30 pm               103 12 Noon 1.15 pm
4 6.30 pm 7.30 pm               104 5.30 pm 6.45 pm

4.9 Case Study

Mr. Nanavati is a leading advocate of our country; he employs typists on an hourly piece-rate
basis for daily work. There are five typists, and their charges and typing speeds differ. As per
Mr. Nanavati's decision, only one job is given to each typist, and a typist is paid for a full hour
even if he works for only a fraction of an hour. Find the most suitable task for each typist and the
least-cost allocation for the following data, using the Hungarian method:

Typist  Rate per hour  Pages typed per hour      Job/Task  Number of pages
A 15 12      P 100
B 16 14      Q 88
C 10 09      R 75
D 12 10      S 150
E 14 11      T 90

4.10 Further Reading

1. Quantitative Techniques in Management, by N.D. Vora, McGraw Hill


2. Operations Research: Theory and Applications, by J.K. Sharma, Macmillan
3. Operations Research, by Hamdy A. Taha, Pearson Education
4. Quantitative Analysis for Management, by Barry Render, Ralph M. Stair, Michael E.
Hanna and T. N. Badri, Pearson Publication
4. Quantitative Analysis for Management, by Barry Render, Ralph M. Stair, Michael E.
Hanna and T. N. Badri, Pearson Publication
5. Use of software like QM for Windows, Excel Solver
Block Summary

In this block, we discussed widely used techniques of operations research in detail for obtaining
optimal output, in terms of maximum profit or minimum cost, with the available resources. In the
first unit, a special technique called linear programming was explained. Based on the
number of products and resources available, formulation and solution for only two decision
variables were discussed through graphical method. Special problems like infeasibility and
unboundedness of LPP were discussed. In the second unit, the technique of simplex method was
discussed for two or more decision variables. In the third unit, the concept of transportation was
discussed with initial basic feasible solution methods and optimal solution method. Special cases
of transportation like unbalanced, multiple optimal solutions and degeneracy were explained. In
the last unit, the method of assignment was explained to find the effectiveness of assigning jobs
to each facility along with special cases like unbalanced assignment, prohibited, multiple optimal
solutions and maximization types of assignment problems.
Block Assignment

Short Answer Questions

1. Define characteristics of linear programming problem


2. Differentiate between convex and non-convex sets
3. Explain types of constraints in LPP
4. Difference between basic and non-basic variable in simplex method
5. Explain rim condition of transportation.
6. Define prohibited assignment problem

Long Answer Questions


1. Explain special cases of graphical method.
2. Explain degeneracy of transportation problem in detail
3. XYZ Airlines, a small commuter airline in India, has six flight attendants whom it wants
to assign to six monthly flight schedules in a way that will minimize the number of nights
they will be away from their homes. The numbers of nights each attendant must be away
from home with each schedule are given in the following table. Identify the optimal
assignments that will minimize the total number of nights the attendants will be away
from home.

Schedule
Attendant A B C D E F
1 7 4 6 10 5 8
2 4 5 5 12 7 6
3 9 9 11 7 10 8
4 11 6 8 5 9 10
5 5 8 6 10 7 6
6 10 12 11 9 9 10

4. A transportation problem involves the following costs, supply, and demand.

TO
From 1 2 3 4 Supply
1 500 750 300 450 12
2 650 800 400 600 17
3 400 700 500 550 11
Demand 10 10 10 10
(a) Find the initial solution using the northwest corner method, the minimum cell cost
method, and Vogel’s approximation model. Compute total cost for each.
(b) Using the VAM initial solution, find the optimal solution using the modified distribution
method (MODI).

5. Find the graphical solution of the following problem.


Find x and y so as to

Minimize Z = X1 + X2 subject to the following constraints;


X1 + 2X2 ≤ 2000 ,
X1 + X2 ≤ 1500,
X2 ≤ 600 ,
X1 , X2 ≥ 0 .
Block no. 4 Specific Operation Research
Methods
_________________________________

Block Introduction
In this block, some more operations research techniques will be discussed. In the first unit,
situations related to the planning, scheduling and controlling of projects will be discussed. The
process of developing network diagrams and finding project completion time will be covered. In
the second unit the nature and scope of waiting line concept will be discussed. Some basic
waiting line models and their application will also be covered. In the last unit the concept and
scope of game theory will be discussed. The consequences of interplay of combination of
strategies with competitors and the methods employed to derive the optimal strategy will be covered.

Block Objective
• Understand situations related to planning , scheduling and controlling of projects
• Develop simple network diagrams with activities.
• Identify the critical path and compute the project completion time
• Compute Slack and float
• Estimate the probability of project completion on a desired date
• Understand the nature and scope of waiting line system
• Describe the characteristics and structure of waiting line system
• Understand the application of statistics in solving waiting line problems
• Apply common waiting line models in suitable business problems
• Determine the optimum parameters of queuing models
• Understand the concept and scope of game theory
• Understand the consequences of interplay of combination of strategies with competitor
• Distinguish between different type of game situations
• Analyse and derive the optimal strategy in a game
• Understand the rule of dominance for solving game problems.

Block Structure
Unit 1 Project Scheduling-PERT/CPM
Unit 2 Waiting Line Models
Unit 3 Game Theory
Unit No. 1 Project Scheduling –CPM/PERT
_________________________________
Unit Structure
1.0 Learning Objectives

1.1 Introduction

1.2 PERT/CPM Network
1.2.1 Key Concepts
1.2.2 Rules for Network Construction
Check your progress 1

1.3 Project Scheduling with Certain Activity Times


1.3.1 Constructing Network Diagram
1.3.2 The Concept of Critical Path
1.3.3 Determination of Earliest Start and Earliest Finish Times - Forward Pass
1.3.4 Determination of Latest Start and Latest Finish Times - Backward Pass
1.3.5 Determination of Float
Check your progress 2

1.4 Project Scheduling with Uncertain Activity Times


1.4.1 Determining the Probability of Completion of the Project by a Desired Date
Check your progress 3

1.5 Let Us Sum Up

1.6 Answers for Check your Progress

1.7 Glossary

1.8 Assignment

1.9 Activities

1.10 Case Study

1.11 Further Reading
1.0 Learning Objectives

After learning this unit, you will be able to:


• Understand situations related to planning , scheduling and controlling of projects
• Develop simple network diagrams with activities.
• Identify the critical path and compute the project completion time
• Compute Slack and float
• Estimate the probability of project completion on a desired date

1.1Introduction

Network analysis plays an important role in project management. By graphical depiction of


activities and events, the planning, scheduling and control of the project become much easier.
Program Evaluation and Review Technique (PERT) and the Critical Path Method (CPM)
represent the two well-known techniques of network analysis used to assist managers in planning
and controlling projects. These projects are usually very large and complex, involving various
activities or jobs to be done by different departments. Examples of such projects are the
construction of a residential complex, roads, bridges, commercial centres, ships and aircraft; the
development of a new drug/vaccine; the installation of a pipeline; a satellite development mission;
the development of new systems; and the like. While working on projects, a large number of
resources in the form of money, manpower, material and equipment are required. The project
managers must schedule and coordinate the various jobs or activities so that the entire project is
completed on time. A complicating factor in this is the interdependence of activities; for example,
some activities can only begin after the completion of other activities. PERT and CPM
techniques are extremely helpful
in giving valuable information like-

1. The total time to complete the project


2. The scheduled start and finish dates of each specific activity
3. Activities that are critical and must be completed as scheduled to keep the project on
schedule
4. The amount of time by which non-critical activities may be delayed.
5. The probability of completing the project by a desired date.

1.2 PERT/CPM Networks

Both PERT and CPM use similar terminology and are used for similar purposes; however, they
were developed independently of each other in the late 1950s. PERT was developed and used for
the planning and design of the Polaris submarine system. CPM, on the other hand, was developed
by the Du Pont Company and Univac of Remington Rand Corporation as a device to control the
maintenance of chemical plants. The basic difference between the two techniques is that PERT is
useful for project scheduling problems where the completion times of the different activities are
not certain, while CPM is used in situations where the activity durations are known with
certainty. In CPM, not only the amount of time needed to perform the various tasks but also the
resources required to perform each of the activities are assumed to be known. This technique is
basically concerned with obtaining the trade-off between the project duration and cost. So
variation in project time is inherent with PERT while in CPM it can be systematically varied by
using additional resources. Basically it can be said that PERT is probabilistic in nature and CPM
is deterministic.

Today’s computerized versions of PERT and CPM techniques combine the best features of both
approaches. Thus the distinction between the two techniques is no longer necessary. So in this
unit we will refer project scheduling techniques as PERT/CPM.

1.2.1Key Concepts

Activity: An operation or task which utilizes resources and consumes time is known as an
activity. An activity is represented by a single arrow, also called an arc, in the project network.
The head of the arrow shows the sequence or flow in which activities are to be done. The activity
arrow is not drawn to scale; its length is a matter of convenience and clarity and is not related to
the time required by the activity. All activities should be defined properly, so that their beginning
and end can be identified clearly. A project consists of several activities. For example, the
construction of a house involves many activities such as getting finance, building the foundation,
ordering and receiving materials, building the house, selecting paint, selecting furnishings,
painting and finishing work.

For example, an arrow labelled 'Painting' represents the painting activity.

Event: An event marks the beginning or completion of an activity. Events are points in time and
can be considered milestones. An event in a network is represented by a circle; events are also
called nodes. The difference between an activity and an event is that an activity is a recognizable
part of the project, involving physical and mental work and requiring time and resources for its
completion, whereas an event is an accomplishment at a point in time which neither requires time
nor consumes resources.

[Figure: an arrow drawn between two event circles; the tail event marks where the activity starts and the head event marks where it ends]

Predecessor Activity: An activity which should be completed immediately prior to the start of
another activity is called Predecessor activity

Successor Activity: An activity which cannot be started, until the completion of one or more
activities is called successor activity

Concurrent Activity: Activities that can be performed simultaneously are called concurrent
activities. It should be noted that an activity can be a predecessor or a successor to another activity
and may be concurrent with one or more activities.
Dummy Activity: A dummy activity is an activity which does not consume any time or
resources. It is an imaginary activity that does not correspond to any real project activity. A
dummy activity is needed when:
1. Two or more activities in a project have identical immediate predecessor and successor
activities.
2. Two or more activities have some (and not all) of their predecessor activities in common.

Dummy activities are usually shown by arrows with dashed lines. To illustrate, in Fig 1, we have
a situation in which both the activities A and B have the same start and end events. It is incorrect
to represent the activities A and B, as shown in Part (i) because 1-2 is used to represent either A
or B. It is against the rule of assigning unique numbers to activities for the purpose of
identification.
[Figure 1: Part (i) shows activities A and B both drawn between events 1 and 2; Part (ii) shows the corrected network in which a dummy activity is introduced so that A and B can be identified as 1-2 and 1-3]
By introducing a dummy activity, the activities A and B can be identified as 1-2 and 1-3
respectively, as shown in Part (ii). Thus, in situations where two or more activities have the same
beginning and end events, a dummy activity is introduced to resolve the problem.

1.2.2Rules of Network Construction

There are a number of concepts and rules which should be followed in dealing with activities and
events when constructing a network. They help to develop a correct structure of the network.

1. Each activity is represented by one and only one arrow in the network. Therefore no
single activity can be represented twice in the network.
2. Events are identified by numbers. The number given to an event should be higher than
the number allotted to the event immediately preceding it.
3. Activities are identified by the numbers of their starting and ending nodes.
4. Parallel activities between two events are prohibited. Thus, no two activities can have
the same start and end events.
5. Before an activity can be undertaken, all activities preceding it must be completed.
6. Dangling must be avoided in a network. A dangling event is an event which is not
connected onward to another event by an activity: an activity merges into the event, but no
activity starts or emerges from it, so the event becomes detached from the network.
Check your Progress 1
1. PERT stands for program enterprise and resource technique. (True/False)

2. A dummy variable is an activity inserted into the AOA network diagram to show a
precedence relationship, but does not represent any passage of time. (True/False)

3. Unlike PERT, CPM incorporates probabilistic time estimates into the project
management process. (True/False)
4. An activity which should be completed immediately prior to the start of another activity
is
a. Successor activity
b. Predecessor activity
c. Dummy activity
d. Concurrent activity
5. An activity which cannot be started until the completion of one or more other activities is
called a
a. Successor activity
b. Predecessor activity
c. Dummy activity
d. Concurrent activity

1.3 Project Scheduling with Certain Activity Times


1.3.1Constructing Network Diagram

The first step in the PERT/CPM scheduling process is to develop a list of all the activities that
comprise a project and their interdependence relationships. Let us take the example of the
construction of a commercial complex. First we need to prepare the plan of the complex. Next we
may prepare a prospectus and start looking for potential tenants. A contractor has to be selected,
and building permits have to be prepared and their approval obtained. Then the construction can
be done. Lastly, the contracts can be finalized with the tenants and they can move in. In this project, the
various activities required to be performed along with the time needed for execution are given in
Table1.
Table 1: Construction of commercial complex
Activity Description Duration Immediate Predecessor
A Prepare Plan of the commercial complex 5 -
B Develop prospectus for tenants 4 A
C Identify the potential tenants 6 B
D Select contractor 3 A
E Prepare building permits 1 A
F Obtain approval for building permits 4 E
G Perform Construction 14 D, F
H Finalize contracts with tenants 12 B, C
I Tenants move in 4 G, H
Note that this table contains information about immediate predecessors. The immediate
predecessors for a particular activity are those that must be completed immediately before this
activity may start. For example, activity A (preparing the plan of the commercial complex) can be
started at any time, as it is the first activity and has no predecessor. Activities B, D and E can be
started only after completing activity A. In the same way the rest of the information in the table
can be understood.
Once the activities comprising a project and the interdependency relationship among them is
clearly identified, they can be portrayed graphically using a network or an arrow diagram. As
earlier explained, the arrows in a project network represent various activities in a project. Along
with each arrow the description and duration of the activity is represented. The circles at the
beginning and at the end of the arrow represent the nodes or the events.

Activity A has no predecessor activity, as it is the first activity. Let us assume that activity ‘A’
starts at node 1 and ends at node 2. It is represented graphically as below:
[Figure: activity A drawn as an arrow from node 1 to node 2]

Next activities B, D and E, have a precedence of A, so all the activities will start at the end node
of A. Let us demonstrate:

[Figure: activity A from node 1 to node 2, with activities B, D and E starting from node 2]

As activity C has a precedence of B, it will start at node 3. Similarly, activity F will start at node
4. As activity G has two preceding activities, D and F, both of these will end at node 5 and G will
start from there.
[Figure: partial network with A (1-2), B (2-3), C (3-6), E (2-4), D (2-5) and F (4-5)]

Similarly, the rest of the precedence relationships can be followed and the final network can be
developed. The figure below depicts the project network for constructing the commercial complex.

[Figure: project network for the commercial complex - A (1-2), B (2-3), C (3-6), D (2-5), E (2-4), F (4-5), G (5-7), H (6-7), I (7-8)]

We have earlier discussed the concept of dummy activity. A dummy activity is an imaginary
activity, which does not require any resource or consume time. It is required when: (a) two or
more activities in a project have identical immediate predecessor and successor activities, or (b)
two or more activities have some (and not all) of their predecessor activities in common. Let us
take an example to understand the use of dummy activity in constructing a network.

Illustration: The table 2 gives the activities involved in construction of a house. Develop a
project network
Table 2 – Construction of a house
Activity Description Duration Immediate Predecessor
A Design House 3 -
B Lay foundation 2 A
C Order and receive materials 1 A
D Build house 6 B,C
E Select paint 1 B,C
F Select furnishings 1 E
G Finish Work 3 D, F

The first activity is A, with no precedence, and activities B and C have a precedence of A. This can
be represented as:

[Figure: partial network with A drawn from node 1 to node 2, and activities B and C starting from node 2]

Both activities D and E have activities B and C as their immediate predecessors; in other words,
B and C have identical immediate predecessor (A) and successor activities. A dummy is required
when two or more activities have identical immediate predecessor and successor activities. Hence
a dummy is required at this step; it can start at the end of either activity B or activity C.

[Figure: partial network with A (1-2), B and C from node 2, a dummy activity joining their end events, and D starting after them towards node 6]

Activity F has a precedence of activity E and activity G is preceded by D and F. These


relationships can be represented as given in the final network.

[Figure: final project network for the house-construction example - A (1-2), B (2-3), C (2-4), dummy (3-4), D (4-6), E (4-5), F (5-6), G (6-7)]
1.3.2 The Concept of Critical Path

To determine the project completion time, we have to analyse the network and identify what is
called the critical path of the network. Let us first understand the concept of a path. A path is
sequence of connected nodes that leads from the start node to finish node. The longest path of the
network is called the critical path. Identifying the critical path of a network is very important,
as it determines the project completion time. If any activity on the critical path is delayed, the
whole project will be delayed. There can be multiple critical paths if there is a tie among the longest
paths. To understand the concept of critical path and project completion let us consider the
earlier example given in Table 1
[Figure: project network with activity times - A(5): 1-2, B(4): 2-3, C(6): 3-6, D(3): 2-5, E(1): 2-4, F(4): 4-5, G(14): 5-7, H(12): 6-7, I(4): 7-8]

In the above network, the time estimates are mentioned within brackets along with the activity
names on the arrows. There are three possible paths for this network. For this simple network, the
critical path can be found by enumerating all the possible paths. These paths are listed below:

Path                              Length
(i) A→B→C→H→I          31
(ii) A→D→G→I               26
(iii) A→E→F→G→I          28
The first path (A→B→C→H→I) is the critical path, as it takes the longest period of time to
complete, i.e. 31 months. For this network the project completion time will be 31 months. The
activities on the critical path are known as critical activities, as a delay in any one of them can
delay the entire project. In other words, there is no slack time in the activities on the critical path.
Slack time is the time by which an activity can be delayed without delaying the project.
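
Path enumeration itself is easy to express in code. The sketch below is a minimal illustration on a small hypothetical network (activities P, Q, R, S with assumed durations, not taken from the example above): it lists every start-to-finish path and its length, and the longest one is the critical path.

# a small hypothetical network: P has no predecessor; Q and R follow P; S follows Q and R
pred = {'P': [], 'Q': ['P'], 'R': ['P'], 'S': ['Q', 'R']}
dur = {'P': 3, 'Q': 6, 'R': 2, 'S': 4}          # assumed durations

succ = {a: [b for b in pred if a in pred[b]] for a in pred}   # invert the predecessor relation

def all_paths(activity, trail=()):
    # yield every activity sequence from this activity down to a terminal activity
    trail = trail + (activity,)
    if not succ[activity]:
        yield trail
    else:
        for nxt in succ[activity]:
            yield from all_paths(nxt, trail)

for path in all_paths('P'):
    print(' -> '.join(path), '| length =', sum(dur[a] for a in path))
# the longest printed path (P -> Q -> S, length 13) is the critical path of this sketch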

For a small network it is simple to list all the possible paths and compare to find the critical path.
As the number of activities increases, the network becomes complex and finding the critical path
by enumerating all path becomes time consuming. Therefore there is a need to develop a
systematic approach to find the critical path. These computations involve a forward and a
backward pass through the network. The forward pass calculation begins, at the start event and
moves to the end event of the project network, i.e. from left to right of the network. The
backward pass calculation begins at the end event and moves to the start event of the network, i.e
from right to left of the network event.

1.3.3 Determination of Earliest start and Earliest Finish Times- Forward pass

The earliest start (ES) time indicates the earliest that a given activity can be scheduled, and the
earliest finish (EF) time indicates the earliest time by which the activity can be completed.
To begin with, each of the activities initiated at the starting node is assumed to start at time ‘0’.
The earliest finish time for each activity is obtained by adding the activity time to the ES time.
The formula of EF is:
EF = ES + t, where t is the activity time

In our example, activity A is the first activity and therefore will start at ‘0’ time. As the duration
of the activity A is 5 months, so its EF time will be 0+5=5. Now all the subsequent activities are
assumed to start as soon as possible, that is as soon as all of their respective predecessor
activities are completed. For a given activity, the ES would be taken as the maximum of the EF’s
of the activities preceding it. Activities B, D and E each have only one predecessor, activity A,
whose EF is 5; so [ES, EF] of B is [5, 9], of D is [5, 8] and of E is [5, 6]. Similarly, for C and F
the [ES, EF] values are [9, 15] and [6, 10] respectively. The ES
time of G has to be the maximum of EF’s of the two preceding activities D (EF=8) and
F(EF=10). Therefore the ES of G is 10 and EF is 24 (10+14). The remaining values are
calculated and given in Table 3.

1.3.4 Determination of Late Start and late Finish times- Backward Pass

The concept of the backward pass is to compute the latest allowable starting and finishing times,
LS and LF, for each of the activities of the project. 'Latest allowable' means the latest the activity
can occur without delaying the project completion time. The
computations for the backward pass start at the terminal event and move towards the start event.
The terminal node is assigned the latest of EF times of activities merging into it. In our example,
there is only one terminal activity, so the time assigned to node 8 will be 31. This implies that the
latest finish (LF) time of activity I is equal to 31. The formula for Latest start time is:

LS = LF − t, where t is the activity time

The LS time of an activity equals its LF time minus its duration, so for activity I the LS would
be 31 − 4 = 27. For the other activities, the LF time is set equal to the smallest (minimum) of the
LS times of its successor activities. The LF time of activities G and H would be equal to 27, the
LS of their only succeeding activity, I. The latest start and finish times of activities F, E, D, C
and B are similarly calculated, as each has only one succeeding activity. However, activity A
has three succeeding activities: B, D and E. In this case, the minimum of the LS times of these
three activities is taken as the LF of activity A. In our example the LS times of activities B, D
and E are 5, 10 and 8 respectively, so the LF of activity A is 5 and its LS is 0. All the calculated
latest finish times are given in Table 3.

Once the forward pass and backward pass times are computed, it becomes very easy to calculate
the critical path. If the early start and late start or early finish and late finish values are equal,
then the activity is referred as a critical activity. If the values are not equal, the activity is termed
as non critical. The path consisting of critical activities is called a critical path.
1.3.5 Determination of float

The concept of float is of paramount importance to a project manager. No critical activity in a
network can be scheduled later than its earliest schedule time without delaying the project
duration. However, a non-critical activity can be scheduled later, which allows control to be
exercised over time, resources or cost. This flexibility is seen in terms of the float or slack that an
activity has: the time available to an activity in addition to its duration. Since each activity has
four associated times, four types of float can be identified. In practice, only three are used and
discussed here:

Total Float: The total float of an activity represents the amount of time by which it can be
delayed without delaying the project completion date. It is equal to the difference between the
total time available for the performance of an activity and the time required or its performance.
For any activity, the total float is calculated as follows:
Total Float = LF − EF = LS − ES = LF − ES − t
where t is the activity time
In our example, for activity D: Total Float = LF − EF = 13 − 8 = 5, or equivalently LS − ES = 10 − 5 = 5

Free Float:The free float is that part of the total float which can be used without affecting the
float of the succeeding activities. The free float is calculated as the earliest start time for the
following activity (j) minus the earliest completion time for this activity (i).

Free Float = ESj − EFi


In our example, for activity D: Free Float = ESj − EFi = 10 − 8 = 2

Independent Float: The independent float time of an activity is the amount of float time which
can be used without affecting either the head or tail events. The value of independent float is as
follows, if ‘i’ is the preceding activity, ‘j’ is the succeeding activity and ‘t’ is the duration of
activity
Independent Float = ESj − LFi − t
In our example, for activity D: Independent Float = ESj − LFi − t = 10 − 5 − 3 = 2

The independent float is always either equal to or less than the free float of an activity. A
negative value of independent float may be obtained, but in that case independent float is taken
as zero. Based on the data given in Table 2, the Earliest and latest times and floats can be
calculated as below:
Table 3: Calculation of earliest and latest times and floats

Activity Duration ES EF LS LF Total float Free float Independent float
A 5 0 5 0 5 0 0 0
B 4 5 9 5 9 0 0 0
C 6 9 15 9 15 0 0 0
D 3 5 8 10 13 5 2 2
E 1 5 6 8 9 3 0 0
F 4 6 10 9 13 3 0 0*
G 14 10 24 13 27 3 3 0
H 12 15 27 15 27 0 0 0
I 4 27 31 27 31 0 0 0
Note: '*' the independent float of activity F is actually −3, but a negative independent float is
taken as zero.
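
The forward-pass and backward-pass rules can be written out as a short program. The sketch below (plain Python; the precedence relationships and durations are taken from Table 1, and the variable names are our own) computes ES, EF, LS, LF and the total float for every activity and flags the critical activities, reproducing the figures in Table 3.

# immediate predecessors and durations of the commercial-complex project (Table 1)
pred = {'A': [], 'B': ['A'], 'C': ['B'], 'D': ['A'], 'E': ['A'],
        'F': ['E'], 'G': ['D', 'F'], 'H': ['B', 'C'], 'I': ['G', 'H']}
dur = {'A': 5, 'B': 4, 'C': 6, 'D': 3, 'E': 1, 'F': 4, 'G': 14, 'H': 12, 'I': 4}
succ = {a: [b for b in pred if a in pred[b]] for a in pred}

order = list(dur)                    # the activities are already listed in a valid precedence order
ES, EF = {}, {}
for a in order:                      # forward pass: ES = maximum EF of the predecessors
    ES[a] = max((EF[p] for p in pred[a]), default=0)
    EF[a] = ES[a] + dur[a]

finish = max(EF.values())            # project completion time (31 months)
LS, LF = {}, {}
for a in reversed(order):            # backward pass: LF = minimum LS of the successors
    LF[a] = min((LS[s] for s in succ[a]), default=finish)
    LS[a] = LF[a] - dur[a]

for a in order:
    total_float = LS[a] - ES[a]
    print(a, ES[a], EF[a], LS[a], LF[a], 'float =', total_float,
          '(critical)' if total_float == 0 else '')
print('Project completion time:', finish)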

Check your progress 2

1. The longest path through a project network is referred to as the


a. activity-on-node path.
b. path of greatest slack.
c. critical path.
d. noncritical path.
2. In general, the latest finish time for an activity is equal to
a. latest finish time minus the activity time estimate.
b. the minimum of the latest start times for the activities that immediately follow.
c. the maximum of the latest start times for the activities that immediately follow.
d. the average of the latest start times for the activities that immediately follow.
3. For activities on a project’s critical path,
a. earliest start time (ES) equals latest start time (LS).
b. earliest start time (ES)is greater than latest start time (LS).
c. earliest start time (ES) is less than latest start time (LS).
d. earliest start time (ES) equals latest finish time (LF).
4. In general, the earliest finish time for an activity is equal to
a. earliest start time + activity time estimate.
b. earliest start time – activity time estimate.
c. earliest start time – slack time.
d. earliest start time + slack time.
5. Using the data given in table 2, calculate the critical path, project completion time, ES,
EF,LS,LF and floats

1.4 Project Scheduling with Uncertain Activity Times

In the previous section, the critical path and the project length were determined on the basis of
activity times that were assumed to be known and constant. However in reality in most projects
these activity times are unlikely to be predicted correctly. In PERT, we assume that it is not
possible to estimate the time for each activity precisely and instead probabilistic estimates of
time are only possible. This method uses three time estimates for an activity. They are:
• Optimistic Time (a). This is the shortest time the activity can take to complete. It is based
on the assumption that there will not be any difficulty in completing the work
• Most likely time (m) This refers to the time that would normally take to complete an
activity. The most likely time estimate is between the optimistic and pessimistic time
estimate.
• Pessimistic time (b) This is the longest time the activity could take to finish. It assumes
that unexpected problems can occur during the execution of the activity

Depending on the values of a, m and b, the resulting distribution of activity duration can take a
variety of forms. Typically the activity completion time is assumed to follow a beta distribution,
which is a skewed curve and can be either positively or negatively skewed.

The expected time (te) of an activity is a time estimate based on the weighted arithmetic mean of
a, m and b. It is calculated as follows:
te = (a + 4m + b) / 6
The variance σ² of the completion time of an activity is calculated as follows:
σ² = ((b − a) / 6)²
To demonstrate the use of PERT, let us take an illustration. Instead of a single estimate, there are
three time estimates.
Table 4: Three time estimates of activity times
Time estimates
Activity Predecessor Activity Optimistic (a) Most likely (m) Pessimistic (b)
A - 1 4 7
B A 2 6 7
C D 3 4 6
D A 6 12 14
E D 3 6 12
F B,C 6 8 16
G E,F 1 5 6

First let us draw the project network reflecting the precedence relationships:
[Figure: project network drawn from the precedence relationships in Table 4]

Next we need to find the expected activity times and variance and then we can apply the
concepts learnt earlier to compute critical path. The calculations of expected times and variance
are shown in given table5

Table5: Calculating expected time and variance

Activity Optimistic (a) Most likely (m) Pessimistic (b) Expected Time Variance
A 1 4 7 4.00 1
B 2 6 7 5.50 0.6944
C 3 4 6 4.17 0.2500
D 6 12 14 11.33 1.7778
E 3 6 12 6.5 2.2500
F 6 8 16 9 2.7778
G 1 5 6 4.50 0.6944
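
The entries in Table 5 follow directly from the two formulas above. A minimal Python sketch (with the three time estimates copied from Table 4):

# (optimistic a, most likely m, pessimistic b) estimates for each activity, from Table 4
estimates = {'A': (1, 4, 7), 'B': (2, 6, 7), 'C': (3, 4, 6), 'D': (6, 12, 14),
             'E': (3, 6, 12), 'F': (6, 8, 16), 'G': (1, 5, 6)}

for act, (a, m, b) in estimates.items():
    te = (a + 4 * m + b) / 6           # expected (weighted average) activity time
    var = ((b - a) / 6) ** 2           # variance of the activity time
    print(f"{act}: expected time = {te:.2f}, variance = {var:.4f}")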

Once the expected times of the activities are obtained, the critical path of the project network is
determined using these time estimates. PERT methodology assumes that the summation of
expected times and variances of the critical activities would yield the expected project duration
and its variance
Path                                    Length
(i) A→B→F→G                    23
(ii) A→D→E→G                   26.33
(iii) A→D→C→F→G             33
The path (A→D→C→F→G) is the critical path, as it takes the longest period of time to
complete, i.e. 33 weeks.

1.4.1 Determining the Probability of Completion of the Project by a Desired Date

Management may often be interested in knowing the probability of completing a project by a
desired date. Let us assume that in our example we are required to complete the project within
34 weeks. Assuming that the distribution of the project completion time follows a normal or
bell-shaped distribution (by the central limit theorem), the probability of completing the project
by a target date can be determined using the following formula:

z = (x − te) / √(Σ σp²)

where x = desired/target completion date
te = expected completion time for the project
Σ σp² = sum of the variances of the activities on the critical path

Using the formula, let us compute:

z = (34 − 33) / √(1 + 1.7778 + 0.25 + 2.7778 + 0.6944) = 1 / 2.5495 = 0.3922

[Figure: normal curve of the project completion time with mean µ = 33 weeks; the shaded area to the left of x = 34 is the required probability]

It is observed from the z table that the probability value for z = 0.39 is 0.1517. However, as we
studied in Unit 4 of Block 1, this is the area measured from the mean, and we need the total
shaded area shown in the figure above. The desired probability is 0.5 + 0.1517 = 0.6517, so we
can say there is about a 65% chance of completing the project by the desired time.
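
The same calculation can be reproduced in a few lines. The sketch below assumes SciPy is available; scipy.stats.norm.cdf returns the cumulative probability directly, so the 0.5 + table-area step is not needed.

from math import sqrt
from scipy.stats import norm

te = 33.0                                          # expected project duration (critical path, weeks)
variances = [1, 1.7778, 0.25, 2.7778, 0.6944]      # variances of the critical activities A, D, C, F, G
target = 34                                        # desired completion date (weeks)

z = (target - te) / sqrt(sum(variances))
print("z =", round(z, 4))                                          # about 0.39
print("P(completion within 34 weeks) =", round(norm.cdf(z), 4))    # about 0.65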
Check Your Progress 3
The following table of probabilistic time estimates (in weeks) and activity predecessors
are provided for a project.
Time estimates (weeks)
Activity a m b Activity Predecessor
A 3 5 7 --
B 4 8 10 A
C 2 3 5 A
D 6 9 12 B, C
E 5 9 15 D

1. Using given data, the expected time to complete activity A is


a. 5.00 weeks.
b. 7.33 weeks.
c. 7.67 weeks.
d. 8.00 weeks.
2. Using given data, the variance for activity E is
a. 1.291 weeks
b. 1.667 weeks.
c. 2.582 weeks.
d. 2.778 weeks.

3. Using given data, the expected time to complete the project is


a. 34.17 weeks.
b. 33.23 weeks.
c. 31.00 weeks.
d. 21.67 weeks.

4. Using given data, the variance of the project’s total completion time is
a. 5.472 weeks.
b. 5.222 weeks.
c. 4.872 weeks.
d. 3.752 weeks.

5. Using given data, the probability that the project could be completed in 34 weeks or less is
approximately
a. 86 percent.
b. 89 percent.
c. 91 percent.
d. 96 percent.

1.5 Let Us Sum Up


In this unit we discussed how network techniques can be used to plan, schedule and control a
wide variety of projects. The most important aspect of project scheduling is the development of
PERT/CPM project network which depicts the activities and their precedence relationships.
From this project network and activity time estimates, the critical path for the network, the
associated critical activities can be identified. Based on the critical path, project completion time
can be calculated. A network provides information on earliest start and finish times, the latest
start and latest finish times and the float for each activity. The length of the time an activity can
be delayed without affecting the project completion time is known as float. Activity times may
be probabilistic or deterministic. PERT uses three time estimates- optimistic, Most likely and
Pessimistic. The activity times are considered to follow beta distribution. The probability of
completion of a project within a specific time period can be determined by the use of normal
distribution

1.6 Answers for Check Your Progress


Answers to check your progress 1
1. False
2. True
3. False
4. (b)
5. (a)

Answers to check your progress 2


1. (c)
2. (b)
3. (a)
4. (a)

Answers to check your progress 3

1. (a)
Solution: t=(3+4*5+7)/6=5.0 weeks
2. (d)
Solution: variance = ((15 − 5)/6)² = 2.778 weeks
3. (c)
4. (b)
5. (c)
Solution: Z=(34-31)/2.285=1.31, Pr.=0.91

1.7 Glossary
Program Evaluation and Review Technique (PERT): A network based project scheduling
techniques with uncertain activity times.
Critical path method (CPM): A network based scheduling technique with certain activity
times.
Activities: Specific jobs or tasks that are components of a project.
Immediate Predecessor: The activities that must be completed immediately prior to the start of
a given activity
Project Network: A graphical representation of a project that depicts the activities and shows
the predecessor relationships among the activities
Critical Path: The longest path in a project network.
Earliest Start Time: The earliest time an activity can begin.
Earliest Finish Time: The earliest time an activity can be completed
Latest start Time: The latest time an activity may begin without increasing the project
completion time.
Latest Finish Time: The latest time an activity may be completed without increasing the project
completion time.
Float/Slack: The length of the time an activity can be delayed without affecting the project
completion time
Optimistic Time: The minimum activity time if everything progresses ideally
Most Probable Time: The most probable activity time under normal conditions.
Pessimistic Time: The maximum activity time if significant delays are encountered.
Expected Time: The average activity time
Beta Probability Distribution: A probability distribution used to describe activity times

1.8 Assignment
1. State the rules for constructing a network.
2. What is the critical path? State the necessary and sufficient conditions for the critical path. Can
a project have multiple critical paths?
3. Explain the concept of float. Distinguish clearly between free and independent float.
4. A small project consists of seven activities, for which the relevant data are given below:

Activity Precedence Duration


A - 4
B - 7
C - 6
D A,B 5
E A,B 7
F C,D,E 6
G C,D,E 5

a) Draw the network and find the project completion time


b) Calculate the total float for each of the activities.
5. A project consisting of eight activities has the following characteristics:

Activity    Predecessor    Optimistic (a)    Most likely (m)    Pessimistic (b)
A - 2 4 12
B - 10 12 26
C A 8 9 10
D A 10 15 20
E A 7 7.5 11
F B,C 9 9 9
G D 3 3.5 7
H E,F,G 5 5 5
a) Draw the PERT network
b) Find out the critical path and expected project completion time
c) If a 30 week deadline is imposed, what is the probability that the project will be
completed within the time limit?

1.9 Activities

You have been put in charge of planning and coordinating the next sales management training
program of your company. List the activities that need to be done to organize the program,
assign assumed activity times to them, and develop a network.

1.10 Case Study

Food Solutions Ltd distributes a variety of food products that are sold through grocery stores and
supermarket outlets. The company receives orders directly from the individual outlets, with a
typical order requesting the delivery of several cases of anywhere from 20 to 50 different
products. Under the company's current warehouse operation, warehouse clerks dispatch order
picking personnel to fill each order and have the goods moved to the warehouse shipping area.
Because of the high labour costs and relatively low productivity of hand order picking,
management decided to automate the warehouse operation by installing a computer controlled
order picking system, along with a conveyor system for moving goods from storage to the
warehouse shipping area.

The director of material management has been named the project manager in charge of the
automated warehouse system. After consulting with members of the engineering staff and
warehouse management personnel, the director compiled a list of activities associated with the
project. The optimistic, most probable and pessimistic times have also been provided for
each activity.

Activity    Description    Predecessor    Optimistic    Most Probable    Pessimistic
A Determine equipment needs - 4 6 8
B Obtain vendor proposals - 6 8 16
C Select vendor A,B 2 4 6
D Order system C 8 10 24
E Design new warehouse layout C 7 10 13
F Design warehouse E 4 6 8
G Design Computer interface C 4 6 20
H Interface computer D,F,G 4 6 8
I Install system D,F 4 6 14
J Train system operators H 3 4 5
K Test System I,J 2 4 6

Develop a report that presents the activity schedule and expected project completion time for the
warehouse expansion project. The top management of Food Solutions has established a required
40-week completion time for the project. Can this completion time be achieved? Include the
relevant probability information in your discussion. What recommendations do you have if the
40-week completion time is required?

1.11 Further Reading

1. Operations Research, By Hamdy A Taha, Pearson Education


2. Operations Research theory and Applications by J.K. Sharma, Macmillan India Ltd.
3. Quantitative techniques in Management, by N.D. Vora, McGraw hills
4. Quantitative methods for business, by Anderson, Sweeney and Williams, Thompson
5. Quantitative Analysis by Render, Stair, Hanna & Badri, Pearson Education
6. Operations Research by Pradeep Pai, Oxford University Press
Unit No. 2 Waiting Line Models
_________________________________
Unit Structure

2.0 Learning Objectives
2.1 Introduction
2.2 Waiting Line System
      2.2.1 Arrival Process
      2.2.2 Queue Structure
      2.2.3 Service System
2.3 Operating Characteristics of Waiting Line System
2.4 Waiting Line Models
2.5 Single Channel Poisson Arrivals with Exponential Service Times (M/M/1)
2.6 Multiple Channel Poisson Arrivals with Exponential Service Times (M/M/C)
2.7 Single Channel Poisson Arrivals with Arbitrary Service Times (M/G/1)
2.8 Economic Analysis of Waiting Lines
2.9 Let Us Sum Up
2.10 Answers for Check Your Progress
2.11 Glossary
2.12 Assignment
2.13 Activities
2.14 Case Study
2.15 Further Reading
2.0 Learning Objectives

After learning this unit, you will be able to:


• Understand the nature and scope of waiting line system
• Describe the characteristics and structure of waiting line system
• Understand the application of statistics in solving waiting line problems
• Apply common waiting line models in suitable business problems
• Determine the optimum parameters of queuing models

2.1 Introduction

Waiting in line is a common occurrence: in banks, public transportation, restaurants, hospitals,
theatres, workshops, salons and several other situations. A waiting line problem is characterised
by the random arrival of customers who come to receive some service. Waiting line models are
developed to help managers understand and take decisions concerning the operation of waiting
lines. In operations research terminology, a waiting line is also known as a queue, and the body
of knowledge dealing with waiting lines is known as queuing theory. The theory of queuing
models has its origin in the work of A.K. Erlang, a Danish telephone engineer, during the early
1900s.

Waiting lines are formed when there are more arrivals than can be handled at the service facility;
no waiting line forms when arrivals are fewer than that. Thus a lack of adequate facilities causes
waiting lines of customers to form. The time a customer has to spend in a waiting line is often
undesirable. The only way the demand for service can be met is to increase the service capacity,
or raise service efficiency, to a higher level (if possible). The service capacity could be built to
such a level that even peak demand can be met. But adding more checkout clerks, bank tellers or
servers is not always the most economical strategy for improving service, as the system will
remain idle when there are few or no customers. The manager therefore needs to decide on an
appropriate level of service which is neither too low nor too high, so that waiting time can be
kept within tolerable limits. The objective of waiting line models is to provide managers with the
information they need to balance desirable service levels against the cost of providing the
service.

2.2 Waiting Line System

The waiting line system consists essentially of three major elements:


(1) Arrival Process
(2) Service System
(3) Queue structure
Figure 1: Schematic representation of a waiting line system (input source → arrival process →
queue → service system → customers leave the system)

2.2.1 Arrival Process

The arrivals from the input populations can be classified on different basis as follows:

Source of arrival: Customer arrivals at a service system may be drawn from a finite or an infinite
population. For example, all the people of a city can be potential customers of a supermarket.
The number of people being very large, it can be taken as infinite. An infinite population is large
enough in relation to the service system that changes in population size caused by subtractions
or additions do not significantly affect the system probabilities. However, there are business
situations where the population is considered finite. For example, consider a group of six
machines being maintained by one repairman. When one machine breaks down, the source
population is reduced to five and the chance of another machine breaking down is less than when
six machines were operating. The probability of another breakdown changes again if two
machines are down, with only four operating.

Size of arrival: Customers may arrive for service individually or in groups. Single arrivals
are illustrated by customers visiting banks, salons, etc. On the other hand, families visiting
restaurants or shipments being loaded onto trucks are examples of bulk or batch arrivals.

Arrival distribution: Defining the arrival process for a waiting line involves determining the
probability distribution of customer arrivals. Queuing models in which the number of arrivals in
a given period of time is known with certainty are known as deterministic models. On the other
hand, in many waiting line situations the arrivals occur randomly and independently of other
arrivals, and we cannot predict when an arrival will occur. In such cases, a frequently employed
assumption is that the Poisson probability distribution provides a good description of the arrival
pattern.

Degree of patience: A patient arrival will wait as long as necessary until the service facility is
ready to serve it. There are two types of impatient arrivals. Members of the first class arrive,
view the service facility and the length of the line, and then decide to leave. Those in the second
class arrive, view, wait in line and leave after some time. The behaviour of the first type is
known as balking and that of the second is termed reneging.
2.2.2 Queue Structure

In queue structure the important thing to know is the queue discipline which means the set of
rules for determining the order of service to customers in a waiting line. The most common
disciplines are:
1. First Come First Served (FCFS)
2. Last Come First Served (LCFS)
3. Service In Random Order (SIRO)
4. Priority service/reservations

2.2.3 Service System

There are two aspects to the service system- (1) the structure of the service system (2)
Distribution of service time.

Structure of service system: The structure of a service system means how the service facilities
exist. Waiting line processes are generally classified into four basic structures: Single-channel
single-phase, single-channel multiple-phase, multiple-channel single-phase and multiple-channel
multiple-phase. Channels are the number of parallel servers and phases denote the number of
sequential servers. A bank with a single clerk providing service to a single line of customers is
an example of single-channel single-phase queuing system. If several clerks are providing
service to a single line of customers, it will be an example of multiple-channel single-phase
system. An example of single-channel multiple-phase system is the manufacturing assembly line
type operation in which the product goes through several sequential machines at workstations to
be worked on. If there are two or more assembly lines manufacturing the same product, it is an
example of multiple-channel multiple-phase.

Distribution of Service Time: The service time is the time a customer spends at the service
facility once the service has started. Waiting line formulas generally specify service rate as the
number of units served per unit of time. A constant service time rule states that each service
takes exactly the same time, as in case of automated operations. When service times are random,
they can be approximated by the exponential probability distribution.

Check your progress 1


1. The only way customers are serviced in queuing situations is on a first-come
first-served basis. (True/False)
2. With an expectation of a long waiting time, particularly when there are limits on time, an
arriving customer may balk. (True/False)
3. A queuing model where the customer arrivals are at known intervals and the service
time is also certain is
a) Deterministic Model
b) Probabilistic Model
c) Priority Model
d) Multi-server model
2.3 Operating Characteristics of Waiting Line System

The techniques of waiting line analysis do not provide an optimal or best solution. Instead, they
generate certain measures, referred to as operating characteristics, that describe the performance
of the queuing system. Management uses these measures to evaluate the system and take
decisions. It is assumed that in the long run the performance measures approach constant average
values; this is referred to as the steady state. The following notation is used to define the basic
operating characteristics:

λ   Average arrival rate (number of customers arriving per unit of time)
μ   Mean service rate (number of customers served per unit of time)
n   Number of customers in the system (waiting or being served)
Ls  Average number of customers in the system (waiting and being served)
Lq  Average number of customers in the queue
Ws  Average time a customer spends in the system (waiting and being served)
Wq  Average time a customer spends waiting in the queue
P0  Probability of zero customers in the system
Pn  Probability of n customers in the system
ρ   Utilization rate; the proportion of time the system is in use

2.4 Waiting Line Models


There are numerous waiting line models available. We shall be considering the following models
in this unit:
a) Single Channel Poisson Arrivals with Exponential Service Times (M/M/1)
b) Multiple Channel Poisson Arrivals with Exponential Service Times (M/M/C)
c) Single- Channel with Poisson arrivals and Arbitrary Service Times (M/G/1)

In each of these models the customer arrivals follow the Poisson distribution. If the arrivals are
independent, with a mean arrival rate of λ per period of time, the Poisson probability function
gives the probability of x arrivals in a specific time period as (discussed in detail in Block 1,
Unit 3):

P(x) = (λ^x e^(−λ)) / x!

where x = number of arrivals in the time period, λ = mean number of arrivals per time period,
and e = 2.71828.

For the first two models, the service times are distributed exponentially. Using the exponential
probability distribution, the probability that the service time will be less than or equal to a time
of length t is (discussed in detail in Block 1, Unit 4):

P(service time ≤ t) = 1 − e^(−μt)

where μ = mean number of units that can be served per unit time period, and e = 2.71828.

Further, in each of these models the customers are assumed to be served in first-come
first-served (FCFS) order. We will now describe each of the models in detail.
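
The two probability statements above can be evaluated directly. Here is a minimal Python sketch (an illustrative addition, not part of the original text; the sample rates used are arbitrary) of the Poisson arrival probability and the exponential service-time probability.

from math import exp, factorial

def poisson_prob(x, lam):
    # probability of x arrivals in a period when the mean arrival rate is lam
    return (lam ** x) * exp(-lam) / factorial(x)

def service_within(t, mu):
    # probability that a service finishes within time t when the mean service rate is mu
    return 1 - exp(-mu * t)

print(poisson_prob(3, 2))        # P(3 arrivals) when 2 arrivals are expected per period
print(service_within(0.1, 30))   # P(service <= 0.1 hour) when mu = 30 per hour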

Check your progress 2


1. Arrivals of patients at a dentist can be described by a Poisson process. (True or False)
2. Service times follow a normal distribution. (True or False)

2.5 Single Channel Poisson Arrivals with Exponential Service Times (M/M/1)

This model is based on the following assumptions:
• The arrivals follow the Poisson distribution with a mean arrival rate of λ
• The service times follow the exponential distribution with a mean service rate of μ
• There is only one service station
• Customers are served on an FCFS basis
• Arrivals are from an infinite population

To evaluate the model, we first need to check whether the service station can handle the customer
demand for service. If λ ≥ μ, the waiting line will grow without limit and the system will collapse.
For the system to be functional, the arrival rate must be less than the service rate (λ < μ).

The following formulas are used to compute the steady-state operating characteristics:
1. Probability that the system is busy, or probability that a customer has to wait for service:
   ρ = λ/μ
   where ρ (rho) is also known as the traffic intensity or utilization factor.
2. Probability that zero units are in the system, or probability that the system is idle:
   P0 = 1 − ρ = 1 − λ/μ
3. Probability of exactly n customers in the system:
   Pn = ρ^n P0 = (λ/μ)^n P0
4. Average/expected number of customers in the system:
   Ls = λ/(μ − λ)  or  ρ/(1 − ρ)
5. Average/expected number of customers in the queue:
   Lq = λ²/(μ(μ − λ))  or  ρ²/(1 − ρ)
6. Average waiting time in the queue:
   Wq = λ/(μ(μ − λ))  or  ρ/(μ − λ)
7. Average waiting time in the system:
   Ws = 1/(μ − λ)
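
These seven formulas can be packaged into a single helper. The following Python sketch (an illustrative addition, not from the original text; the function name mm1 is arbitrary) assumes λ < μ so that a steady state exists.

def mm1(lam, mu):
    # steady-state operating characteristics of a single-channel M/M/1 queue
    assert lam < mu, "arrival rate must be less than service rate"
    rho = lam / mu                      # utilization, P(system busy)
    p0 = 1 - rho                        # P(system idle)
    ls = lam / (mu - lam)               # average number in the system
    lq = lam ** 2 / (mu * (mu - lam))   # average number in the queue
    wq = lam / (mu * (mu - lam))        # average wait in the queue
    ws = 1 / (mu - lam)                 # average time in the system
    return {"rho": rho, "P0": p0, "Ls": ls, "Lq": lq, "Wq": wq, "Ws": ws}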

Illustration 1: A bank is considering opening a drive-thru window for customer service.
Management estimates that customers will arrive at the rate of 20 per hour. The teller who will
staff the window can serve customers at the rate of 30 per hour.
a) What is the expected waiting time in the system per customer?
b) What is the mean number of customers waiting in the system?
c) What is the probability of zero customers in the system?
d) What is the utilization factor?

Here the arrival rate λ = 20 customers/hour and μ = 30 customers/hour.

Expected waiting time in the system:
Ws = 1/(μ − λ) = 1/(30 − 20) = 1/10 hour, or 6 minutes

Mean number of customers waiting in the system:
Lq = λ²/(μ(μ − λ)) = 20²/(30(30 − 20)) = 4/3 customers

Probability of zero customers in the system:
P0 = 1 − ρ = 1 − λ/μ = 1 − 2/3 = 1/3

Utilization factor:
ρ = λ/μ = 20/30 = 2/3
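
As a check, the Python sketch given after the formula list reproduces these figures: mm1(20, 30) returns rho ≈ 0.667, P0 ≈ 0.333, Lq ≈ 1.333, Wq ≈ 0.0667 hour and Ws = 0.1 hour, matching the values computed above.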

Illustration 2: A repairman finds that the time spent on a job has an exponential distribution
with a mean of 30 minutes. If machines arrive for repair at an average rate of 10 per 8-hour day,
what is the expected idle time each day? How many jobs are ahead of the average machine just
brought in? What is the probability that four machines are waiting to be repaired?

Here the arrival rate λ = 10 machines/day, and the mean service time is 30 minutes. This means
2 machines are repaired per hour, or (2 × 8) = 16 machines per day, so μ = 16 machines/day.

The probability that the repairman is idle is:

P0 = 1 − λ/μ = 1 − 10/16 = 3/8 = 0.375

So the expected idle time per day = 8 × 3/8 = 3 hours.

To determine the number of jobs ahead of the machine just brought in, we calculate the average
number of machines in the system:

Ls = λ/(μ − λ) = 10/(16 − 10) = 5/3 = 1.67 machines

Four machines waiting to be repaired means there are five machines in the system in total:

Pn = (λ/μ)^n P0, so P5 = (10/16)^5 × 0.375 ≈ 0.036

Check your Progress 3


1. In a single server queuing situation, steady state is reached after a sufficiently long period
of time if the service rate is greater than the arrival rate ( True/False)
2. In a Poisson-Exponential single server model, the probability of having at least n
customers in the system is equal to𝜌𝑛 (1 − 𝜌). (True/False)
3. An arrival rate of 10 customers per hour according to Poisson process implies an average
inter-arrival time of
a) 10 minutes
b) 6 minutes
c) 5 minutes
d) 2 minutes

2.6 Multiple Channel Poisson Arrivals with Exponential Service Times (M/M/C)

This model is based on the following assumptions:
• The arrivals follow the Poisson distribution with a mean arrival rate of λ
• The service times follow the exponential distribution
• The service rate μ is the same for each channel
• There are K service stations, each of which provides the same service
• Arrivals wait in a single line and move to the first open channel
• Customers are served on an FCFS basis
• The arrival rate is less than the combined rate of all K service facilities

The following formulas are used to compute the steady-state operating characteristics for
multiple-channel waiting lines, where

λ = the arrival rate of the system
μ = the service rate for each channel
K = the number of channels

1. Probability that the system is idle:
   P0 = 1 / [ Σ (from i = 0 to K−1) (λ/μ)^i / i!  +  ((λ/μ)^K / K!) × (Kμ/(Kμ − λ)) ]
2. Utilization factor of the entire system:
   ρ = λ/(Kμ)
3. Probability of exactly n customers in the system:
   Pn = ((λ/μ)^n / n!) P0, when n ≤ K
   Pn = ((λ/μ)^n / (K! K^(n−K))) P0, when n > K
4. Probability that a customer arriving in the system must wait for service (i.e. all the servers
   are busy):
   Pw = ((λ/μ)^K / K!) × (Kμ/(Kμ − λ)) × P0
5. Average number of customers in the waiting line:
   Lq = ((λ/μ)^K ρ / (K!(1 − ρ)²)) P0
6. Average number of customers in the system:
   Ls = Lq + λ/μ
7. Average waiting time in the queue:
   Wq = Lq/λ
8. Average waiting time in the system:
   Ws = Wq + 1/μ
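
These multiple-channel formulas can likewise be collected into one helper. The following Python sketch (an illustrative addition, not from the original text; mmc is an arbitrary name) assumes λ < Kμ.

from math import factorial

def mmc(lam, mu, k):
    # steady-state operating characteristics of a multiple-channel M/M/C queue
    assert lam < k * mu, "arrival rate must be less than the combined service rate"
    r = lam / mu
    p0 = 1 / (sum(r ** i / factorial(i) for i in range(k))
              + (r ** k / factorial(k)) * (k * mu / (k * mu - lam)))
    rho = lam / (k * mu)                                           # system utilization
    pw = (r ** k / factorial(k)) * (k * mu / (k * mu - lam)) * p0  # P(arrival must wait)
    lq = (r ** k * rho / (factorial(k) * (1 - rho) ** 2)) * p0     # average number in queue
    ls = lq + r                                                    # average number in system
    wq = lq / lam                                                  # average wait in queue
    ws = wq + 1 / mu                                               # average time in system
    return {"P0": p0, "Pw": pw, "rho": rho, "Lq": lq, "Ls": ls, "Wq": wq, "Ws": ws}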
Illustration 3: The customer care centre of a departmental store helps customers with their
questions, complaints or issues regarding credit card bills. Chairs are placed along the wall,
forming a single waiting line. The customers are served by three store representatives on a
first-come first-served basis. The store management wants to analyse this queuing system, as
excessive waiting times can make customers angry enough to shop at other stores. A study of
the customer service department over a 6-month period shows that an average of 10 customers
arrive per hour and an average of 4 customers can be served per hour by a customer care
representative.
Here, λ = 10 customers/hour
μ = 4 customers/hour
K = 3 customer representatives
Kμ = 3 × 4 = 12 (> λ)

Using the multiple server model formulas, we can compute the following operating
characteristics for the departmental store:

The probability that the system is idle (no customers in the service department):

P0 = 1 / [ (10/4)⁰/0! + (10/4)¹/1! + (10/4)²/2! + ((10/4)³/3!) × (3 × 4/(3 × 4 − 10)) ] = 0.045

Probability that a customer arriving in the system must wait for service (i.e. all three servers
are busy):

Pw = ((10/4)³/3!) × (3 × 4/(3 × 4 − 10)) × 0.045 = 0.703

Utilization factor of the entire system:

ρ = λ/(Kμ) = 10/(3 × 4) = 0.833

Average number of customers in the waiting line:

Lq = ((10/4)³ × 0.833/(3!(1 − 0.833)²)) × 0.045 = 3.5 customers waiting in the department

Average number of customers in the system:

Ls = Lq + λ/μ = 3.5 + 10/4 = 6 customers in the department

Average waiting time in the queue:

Wq = Lq/λ = 3.5/10 = 0.35 hour, or 21 minutes waiting in line

Average waiting time in the system:

Ws = Wq + 1/μ = 0.35 + 1/4 = 0.60 hour, or 36 minutes in the service department

The department store's management has observed that customers are frustrated by the waiting
time of 21 minutes and the 0.703 probability of having to wait. The management is considering
employing an additional service representative to improve the level of service. The operating
characteristics for this system are recomputed with K = 4 service representatives: P0 = 0.073,
Pw = 0.31, Ls = 3 customers, Ws = 18 minutes, Lq = 0.5 customers, Wq = 3 minutes.

The waiting time is considerably reduced, from 21 minutes to 3 minutes. However, this
improvement in the quality of service would have to be compared with the cost of adding an
extra service representative before taking any decision.
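
Under the same assumptions, calling the mmc sketch shown after the formula list with mmc(10, 4, 3) reproduces P0 ≈ 0.045, Pw ≈ 0.70, Lq ≈ 3.5 customers and Wq ≈ 21 minutes, while mmc(10, 4, 4) gives P0 ≈ 0.073, Lq ≈ 0.5 customers and Wq ≈ 3 minutes, confirming the comparison discussed above.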

Check your Progress 4


1. The arrival rate should be less than the combined rate of all the service facilities in
the M/M/C waiting line model. (True or False)
2. If λ = 5/hour and μ = 2/hour for a three-server model, the utilization factor ρ is
a) 5/2
b) 2/15
c) 2
d) 5/6

2.7 Single Channel with Poisson Arrivals and Arbitrary Service Times (M/G/1)
This model is based on the following assumptions:
• The arrivals follow the Poisson distribution with a mean arrival rate of λ
• The service time has a general probability distribution with a mean service rate of μ and
  a standard deviation of σ
• There is a single service station
• A single waiting line is formed
• Customers are served on an FCFS basis

The following formulas are used to compute the steady-state operating characteristics for the
M/G/1 model, where

λ = the arrival rate
μ = the service rate
σ = the standard deviation of the service time

1. Probability that the system is idle:
   P0 = 1 − λ/μ
2. Probability that an arriving customer has to wait for service:
   Pw = λ/μ
3. Average number of customers in the waiting line:
   Lq = (λ²σ² + (λ/μ)²) / (2(1 − λ/μ))
4. Average number of customers in the system:
   Ls = Lq + λ/μ
5. Average waiting time in the queue:
   Wq = Lq/λ
6. Average waiting time in the system:
   Ws = Wq + 1/μ
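
A compact Python sketch of these M/G/1 formulas follows (an illustrative addition, not from the original text; mg1 is an arbitrary name). The commented call reproduces Illustration 4, which follows.

def mg1(lam, mu, sigma):
    # steady-state operating characteristics of an M/G/1 queue
    rho = lam / mu
    assert rho < 1, "arrival rate must be less than service rate"
    lq = (lam ** 2 * sigma ** 2 + rho ** 2) / (2 * (1 - rho))  # average number in queue
    return {"P0": 1 - rho, "Pw": rho, "Lq": lq,
            "Ls": lq + rho, "Wq": lq / lam, "Ws": lq / lam + 1 / mu}

# print(mg1(0.35, 0.50, 1.2))  # Lq ≈ 1.11, Ls ≈ 1.81, Wq ≈ 3.17 min, Ws ≈ 5.17 min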
Illustration 4: Retail sales at a bank are handled by one clerk. Customer arrivals are random and
the arrival rate is 21 customers per hour. A study of the service process shows that the mean
service time is 2 minutes per customer, with a standard deviation of σ = 1.2 minutes. Compute
the operating characteristics for the M/G/1 model.

Here the arrival rate λ = 21/hour, or 21/60 = 0.35 customers per minute (converted to minutes,
as the rest of the data is in minutes). The mean service time of 2 minutes means that the service
rate of the clerk is 1/2 = 0.50 customers per minute.

The operating characteristics of the M/G/1 model are computed as follows:

The probability that the system is idle:

P0 = 1 − λ/μ = 1 − 0.35/0.50 = 0.30

Probability that an arriving customer has to wait for service:

Pw = λ/μ = 0.70

Average number of customers in the waiting line:

Lq = (λ²σ² + (λ/μ)²) / (2(1 − λ/μ)) = ((0.35)²(1.2)² + (0.70)²) / (2(1 − 0.70)) = 1.1107 customers

Average number of customers in the system:

Ls = Lq + λ/μ = 1.1107 + 0.70 = 1.8107 customers

Average waiting time in the queue:

Wq = Lq/λ = 1.1107/0.35 = 3.1733 minutes

Average waiting time in the system:

Ws = Wq + 1/μ = 3.1733 + 1/0.50 = 5.1733 minutes
The manager of retail sales can review these operating characteristics to decide whether
scheduling a second clerk at the retail sales counter would be worthwhile.

Check Your Progress 5


1. The service times in the M/G/1 waiting line model follow the exponential
distribution. (True/False)
2. To calculate the operating characteristics of the M/G/1 model, we require:
a. µ
b. ʎ
c. σ, ʎ and µ
d. ʎ and µ

2.8 Economic Analysis of Waiting Lines

The information derived from the operating characteristics of the various models can be used to
determine an appropriate level of service. Inadequate service causes excessive waiting, which has
a cost in terms of customer frustration, loss of goodwill, the direct cost of idle machines
(machines needed in production waiting for repair work), and so on. On the other hand, a high
service level results in higher setup costs and idle time for the service station. Thus the goal of
queuing modelling is to achieve an economic balance between the cost of providing service and
the cost associated with waiting for service. The optimum level of service is the one at which the
total of the waiting time cost and the cost of providing service is minimum. Figure 2 shows that
increasing the service level increases the cost of service and reduces the cost of waiting time.

Figure 2: Cost Relationship in Waiting Line Analysis

The thick curve shows that the total cost decreases to a point and then starts increasing. The
service level corresponding to the minimum point on this curve is the optimum service level.

Total cost = Cost of waiting time (Cw) + Cost of service (Cs)

Illustration 5: A vending machine company supplies beverages to a university. Because of rough
handling by students, management has a constant repair problem. The machines break down at
an average rate of three per hour, and the breakdowns are Poisson distributed. Downtime costs
the company Rs 250 per hour per machine, and each maintenance worker is paid Rs 160 per hour.
One worker can service machines at an average rate of five per hour; two workers working
together can service seven per hour; and a team of three workers can service eight per hour, with
service times distributed exponentially. What is the optimum level of service?
Here,
Downtime cost is Rs 250 per hour per machine
Repair cost is Rs 160 per hour per worker

Case I: One worker

λ = 3/hour and μ = 5/hour

The average number of machines in the system is

Ls = λ/(μ − λ) = 3/(5 − 3) = 1.5 machines

Cost of waiting (Cw) = downtime cost for 1.5 machines = 250 × 1.5 = Rs 375 per hour
Cost of service (Cs) for one worker = Rs 160 per hour
Total cost per hour = Cw + Cs = 375 + 160 = Rs 535

Case II: Two workers

λ = 3/hour and μ = 7/hour

The average number of machines in the system is

Ls = λ/(μ − λ) = 3/(7 − 3) = 0.75 machines

Cost of waiting (Cw) = downtime cost for 0.75 machines = 250 × 0.75 = Rs 187.5 per hour
Cost of service (Cs) for two workers = 160 × 2 = Rs 320 per hour
Total cost per hour = Cw + Cs = 187.5 + 320 = Rs 507.5

Case III: Three workers

λ = 3/hour and μ = 8/hour

The average number of machines in the system is

Ls = λ/(μ − λ) = 3/(8 − 3) = 0.60 machines

Cost of waiting (Cw) = downtime cost for 0.60 machines = 250 × 0.60 = Rs 150 per hour
Cost of service (Cs) for three workers = 160 × 3 = Rs 480 per hour
Total cost per hour = Cw + Cs = 150 + 480 = Rs 630

Comparing the costs of one, two and three workers, the total cost is lowest in Case II. Hence the
optimal solution is to hire two workers.
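
The cost comparison above can be automated. The following Python sketch (an illustrative addition, not from the original text) loops over the three staffing options using the M/M/1 result Ls = λ/(μ − λ).

def total_cost(lam, mu, workers, downtime_cost=250, wage=160):
    # hourly total cost: downtime cost of machines in the system plus wages
    ls = lam / (mu - lam)
    return downtime_cost * ls + wage * workers

service_rates = {1: 5, 2: 7, 3: 8}   # number of workers -> combined service rate per hour
for n, mu in service_rates.items():
    print(n, "worker(s): Rs", round(total_cost(3, mu, n), 1), "per hour")
# prints 535.0, 507.5 and 630.0, so two workers is the cheapest option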

Check your progress 6


1. The optimum service level is where the total cost is minimum. (True/False)
2. With the increase in service level
a. Cost of waiting increases
b. Cost of waiting decreases
c. Cost of service increases
d. Both (b) and (c)

2.9 Let Us Sum Up

Waiting line theory deals with situations where customers arrive, wait for service, get the
service and leave the system. In this unit we discussed a variety of waiting line models that have
been developed to help managers make better decisions concerning the operation of waiting
lines. The formulas required to compute the operating characteristics or performance measures
for each model were presented. The operating characteristics include the probability that the
system is idle, the average number of customers in the system, the average number of customers
in the queue, the average time a unit spends in the waiting line, the average time a unit spends in
the system, and the probability that an arriving customer has to wait for service.

Queuing structures are analysed to determine the optimum level of service, at which the total
cost of providing service and of waiting is minimized. An increase in the level of service
increases the cost of providing service but reduces the cost of waiting. While waiting line models
can be deterministic as well, the probabilistic ones occur more commonly and are the ones
analysed. The three models discussed in this unit are the single-channel Poisson-arrival model
with exponential service times (M/M/1), the multiple-channel Poisson-arrival model with
exponential service times (M/M/C) and the single-channel Poisson-arrival model with arbitrary
service times (M/G/1). For a queuing system to be functional, the arrival rate of customers per
unit of time should be less than the service rate.

2.10 Answers for Check your Progress


Answers to check your progress 1
1. False
2. True
3. (a)
Answers to check your progress 2
1. True
2. True
Answers to check your progress 3
1. True
2. True
3. (b)
Answers to check your progress 4
1. True
2. (d)
Answers to check your progress 5
1. False
2. (c)
Answers to check your progress 6
1. True
2. (d)

2.11 Glossary

Queue: A single waiting line that forms in front of service facility


Queuing Theory: The body of Knowledge dealing with waiting lines
Operating Characteristics: The performance measures for a waiting time including the
probability of system being busy, idle, average number of units in the waiting line, the average
waiting time etc
Finite queue: A waiting line that has a limited capacity
Infinite queue: A waiting line that can grow to any length
Balking: The behaviour of a customer who arrives, views the service facility and the length of
the line, and then decides to leave
Reneging: The behaviour of a customer who arrives, views the line, waits for some time and
then leaves
Single Channel waiting line: A waiting line with only one service facility
Arrival Rate: The mean number of customers arriving in a given period of time
Queue discipline: The order in which customers are served.
Service Rate: The mean number of customers that can be served by one service facility in a
given period of time
Multiple -channel waiting Line: A waiting line with two or more parallel service facilities

2.12 Assignment
1. Which assumptions are necessary to employ (M/M/C) waiting Line Model?
2. Discuss the waiting line system in detail with some queuing situations.
3. Describe a single server waiting line model. Give an example from real life for each of
the following queue disciplines:
a. First come first served
b. Last come first served
4. The mechanic at Carpoint is able to install new mufflers at an average of three per hour
while customers arrive at an average rate of 2 per hour. Assuming that the conditions for
a single –server infinite population model are all satisfied, calculate the following:
a. Utilisation parameter
b. The average number of customers in the system
c. The average time a customer spends in the queue
d. The probability that there are more than three customers in the system.
5. A service station has five mechanics each of whom can service a scooter in 2 hours on an
average. The scooters are registered at a single counter and then sent for servicing to
different mechanics. Scooters arrive at the service station at an average rate of 2 scooters
per hour. Assuming that arrivals are Poisson distributed and servicing times are
distributed exponentially, determine:
a. The probability that system is idle
b. The probability that there are 3 scooters in the service centre
c. The expected number of scooters waiting in the queue
d. The average waiting time in the queue.

2.13 Activities
Analyse the following queuing systems by describing their various system properties:
a) Hospital Emergency Room
b) Traffic light
c) Computer system at university
2.14 Case Study

A Fast Shop drive-in market has one checkout counter where one employee operates the cash
register. The combination of the cash register and the operator is the server in this queuing
system; the customers who line up to pay for the selected items form the waiting line. Customers
arrive at a rate of 24 per hour according to a Poisson distribution, and service times are
exponentially distributed with a mean rate of 30 customers per hour.

The arrival rate of 24 per hour means that, on average, a customer arrives about every 2.5
minutes (60/24). This indicates the store is busy. Because of the nature of the store, customers
purchase few items and expect a quick service. Customers expect to spend more time in a
supermarket where they make larger purchases but they shop at a drive-in market because it is
quicker than a supermarket. Given customer’s expectations, the manager believes that it will be
unacceptable for a customer to wait beyond 5 minutes in the waiting line.

The market manager wants to determine the operating characteristics for this waiting line system
and wants to test if hiring another employee to pack up purchases will help in reducing customer
waiting time and still be economically viable. An extra employee will cost the market manager
$150 per week. With the help of a market research agency, the manager has determined that for
each minute that customer waiting time is reduced, the store avoids a loss in sales of $75 per
week. The service rate with two employees will be 40 customers per hour.

2.15 Further Reading

1. Operations Research, By Hamdy A Taha, Pearson Education


2. Operations Research theory and Applications by J.K. Sharma, Macmillan India Ltd.
3. Quantitative techniques in Management, by N.D. Vora, McGraw hills
4. Quantitative methods for business, by Anderson, Sweeney and Williams, Thompson
5. Quantitative Analysis by Render, Stair, Hanna & Badri, Pearson Education
6. Operations Research by Pradeep Pai, Oxford University Press
Unit No. 3 Game Theory
_________________________________
Unit Structure
3.0 Learning Objectives
3.1 Introduction
3.2 Basic Concepts in Game Theory
3.3 Two-person zero-sum game
3.3.1 Payoff Matrix
3.3.2 Maximin Strategy
3.3.3 Minimax Strategy
3.3.4 Saddle Point
3.4 Game with No Saddle point
3.5 Principle of Dominance
3.6 Solution of 2 X n and m X 2 games
3.7 Let Us Sum Up
3.8 Answers for Check your Progress
3.9 Glossary
3.10 Assignment
3.11 Activities
3.12 Case Study
3.13 Further Reading
3.0 Learning Objectives

After learning this unit, you will be able to:


• Understand the concept and scope of game theory
• Understand the consequences of interplay of combination of strategies with competitor
• Distinguish between different type of game situations
• Analyse and derive the optimal strategy in a game
• Understand the rule of dominance for solving game problems.

3.1 Introduction
The models and techniques we have discussed so far in operations research involved the interests
of a single organization. For example, in the transportation problem we are interested in
minimizing cost or maximizing profit given the organizational constraints. However, in real life,
decisions often have to be taken where two or more rational opponents are involved under
conditions of competition and conflicting interests. Game theory deals with situations where an
individual, a group or an organization is not in complete control of the other players (the
opponents), and it addresses situations involving conflict, co-operation or both at different levels.

The main objective of game theory is to determine the rules of rational behaviour in situations in
which the outcomes depend on the actions of interdependent players. A game is a situation in
which two or more players are competing. The players may have different objectives, but their
fates are intertwined. They may have some control that will influence the outcome, but they do
not have complete control over the others. Game theory is the analysis (or science) of rational
behaviour in interactive decision-making. It is therefore distinguished from individual
decision-making situations by the presence of significant interactions with other 'players' in the
game. Game theory can be used to help explain past events and situations, to predict what actions
players will take in future games and, based on this, to take decisions in interactions with other
players so as to achieve the best outcome.

3.2 Basic Concepts in Game Theory

Game theory models can be classified on the basis of factors like number of players involved,
sum of the gains or losses and the number of strategies employed.

If there are two participants in a game it is called a two-person game, and if more than two
participants are involved it is an n-person game. In a game, if the sum of the gains and losses is
equal to zero, it is called a zero-sum or constant-sum game. If the sum of the gains and losses is
not equal to zero, it is called a non-zero-sum game. A game is said to be finite if each player has
the option of choosing from only a finite number of strategies; otherwise it is called infinite.

Some of the key concepts to be used in game theory are described below:

Players: The competitors or decision makers in a game are called the players of the game.
Strategies: The alternative courses of action available to a player are called strategies.

Payoff: The outcome of playing a game is called the payoff to the concerned player.

Optimal Strategy: A strategy in which the player can achieve the maximum payoff is called the
optimal strategy.

Payoff Matrix: The tabular display of the payoffs of the players under various alternatives is
called the payoff matrix.

Pure strategy: A game solution that provides a single best strategy for each player.

Mixed strategy: If there is no one specific strategy as the best strategy for any player in a game,
then the game is referred to as mixed strategy or a mixed game. Each player has to choose
different alternative courses of action from time to time.

Check your Progress 1


1. In a two-person game, both players must have an equal number of
strategies. (True/False)
2. The zero-sum game implies that any gain of one player is exactly matched by a loss to
the other, so that their sum is equal to zero. ( True/ False)
3. Game theory is concerned with
a) Predicting the results of bets
b) Choice of an optimal strategy in conflict situations
c) Utility maximization by firms
d) Migration pattern in India
4. In game theory, a game in which one firm can gain only what another firm loses is called
a) A non zero-sum game
b) Two-person game
c) Prisoners dilemma
d) Zero-sum game

3.3 Two-Person Zero-Sum Games

A two-person zero-sum game is one involving two persons in which any gain of one player is
exactly matched by a loss to the other, so that their sum is equal to zero. Suppose there are two
companies, A and B, in a region selling a competing product and fighting for a larger market
share. With a total market of a given size, any share of the market gained by one player will be
lost by the other, and therefore the sum of the gains and losses equals zero.

3.3.1 Payoff Matrix

When players select particular strategies, the payoff can be represented in the form of a payoff
matrix. Suppose firm A has m strategies and firm B has n strategies; the payoff matrix will be:

                                Player B's strategies
                                B1     B2    ...   Bn
Player A's strategies    A1     a11    a12   ...   a1n
                         A2     a21    a22   ...   a2n
                         ...    ...    ...   ...   ...
                         Am     am1    am2   ...   amn

The matrix is written from player A's point of view. Player A wishes to gain as large a payoff aij
as possible, while player B will do his best to make aij as small as possible.

Let us assume that both firms A and B are considering three strategies to gain market share:
advertising, promotion and quality improvement. The strategies of advertising, promotion and
better quality are represented as A1, A2 and A3 respectively for firm A, and B1, B2 and B3
respectively for firm B. As shown in the matrix below, there are 3 × 3 = 9 combinations of moves
in total. Each pair of moves affects the market share in a particular way. As the payoff is in terms
of A, a positive payoff indicates that A has gained at the expense of firm B, while a negative
payoff implies B's gain at A's expense. For example, the strategy of advertising by both firm A
and firm B will lead to a 12% market share gain for firm A, while advertising by A and
promotion by B would lead to a shift of 7% market share in favour of B. Similarly, there are
payoffs corresponding to the other pairs of moves.

B’s Strategy
B1 B2 B3
A’s Strategy A1 12 -7 -2
A2 6 7 3
A3 -10 -5 2

3.3.2 Maximin Strategy

The conservative approach in selection of best strategy would call for assuming the worst to
happen and act accordingly. In reference to the pay off matrix, if firm A employs A1 strategy it
would expect the firm B to employ strategy B2, thereby reducing A’s payoffs from the strategy
A1 to its minimum value of -7, representing a loss to firm A. If the firm employs A2 strategy, it
would expect the firm B to employ B3 strategy which would give a three percent gain in market
share. Similarly for strategy A3, it will expect Firm B to employ B1 strategy, with a loss of 10
percent. The firm A would like to make the best of the situation by choosing the strategy which
gives maximum of these minimum pay-offs. Since the minimal payoff to strategies A1,A2 and A3
are -8, 3 and -10 respectively; firm A would select A2 as its strategy. This decision rule is called
the Maximin Strategy.

3.3.3 Minimax Strategy

Firm B would also employ a similar conservative approach. When B employs strategy B1, it
expects firm A to employ A1, which gives the maximum gain to A. In a similar way, adoption of
B2 or B3 would make it expect firm A to adopt strategy A2. To minimize the gain of the
competing firm, firm B would select the strategy which yields the least gain to firm A. This
decision rule of firm B is called the Minimax strategy.

3.3.4 Saddle Point

As discussed above, it is clear that the maximin strategy A2 of firm A and the minimax strategy
B3 of firm B both lead to the same payoff. These strategies are based on the conservative
approach of choosing the best strategy by assuming that the worst will happen. By adopting the
maximin strategy, A can stop B from lowering its gain in market share below 3 percent, and by
adopting the minimax strategy, firm B can stop A from gaining more than 3 percent market share.
The situation is therefore one of equilibrium. The point of equilibrium is known as the saddle
point.

To obtain the saddle point, if it exists, we determine the minimum payoff value for each row and
the maximum payoff value for each column. If the maximum of the row minima is equal to the
minimum of the column maxima, it represents the saddle point. For illustration, let us continue
with the same problem:

B’s Strategy Row Minima


B1 B2 B3
A’s Strategy A1 12 -7 -2 -7
A2 6 7 3 3*
A3 -10 -5 2 -10
Column Maxima 12 7 3*

Here 3 represents the saddle point, corresponding to strategies A2 and B3.

It is also possible to have more than one saddle point for a given problem. For example, consider
the following matrix:
B’s Strategy Row Minima
B1 B2 B3 B4
A’s Strategy A1 2 15 13 -14 -14
A2 -5 6 -4 -5 -5*
A3 5 -2 0 -5 -5*
Column Maxima 5 15 13 -5*

In relation to B's minimax strategy (B4), firm A could employ either A2 or A3, each of which
represents a maximin strategy for it. As the payoff corresponding to B's minimax strategy and
either of A's maximin strategies is identical, there are two saddle points, represented by A2B4
and A3B4. The value of the game is −5, a net loss of 5 points to A and an equivalent gain to B.

Illustration 1: Soul Ltd has forecast sales for its products and the products of its competitor,
Pure Ltd. There are four strategies for Soul Ltd (S1, S2, S3, S4) and three strategies available to
Pure Ltd (P1, P2, P3). The payoffs for all twelve combinations are given below. Considering this
information, what would be the optimal strategy for Soul Ltd? For Pure Ltd? What is the value
of the game? Is the game fair?
Pure’s Strategy
P1 P2 P3
S1 30000 -21000 1000
S2 18000 14000 12000
Soul’s Strategy S3 -6000 28000 4000
S4 18000 6000 2000

For determining the optimal strategies, we should examine if saddle point exists for the given
problem:

Pure’s Strategy Row Minima


P1 P2 P3
S1 30000 -21000 1000 -21000
Soul’s Strategy S2 18000 14000 12000 12000*
S3 -6000 28000 4000 -6000
S4 18000 6000 2000 2000
Column Maxima 30000 28000 12000*

Here the saddle point exists at S2P3. The optimal strategy for Soul Ltd is S2 and for Pure Ltd it
is P3. The value of the game is V = 12000, a gain of 12000 to Soul Ltd. Since V ≠ 0, it is not a
fair game.
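
The maximin/minimax test for a saddle point can also be carried out with a short routine. The following Python sketch (an illustrative addition, not from the original text) checks a payoff matrix written in terms of player A, and the example call reproduces Illustration 1.

def saddle_point(payoff):
    # return (row index, column index, value) of a saddle point, or None if none exists
    row_minima = [min(row) for row in payoff]
    col_maxima = [max(col) for col in zip(*payoff)]
    maximin, minimax = max(row_minima), min(col_maxima)
    if maximin != minimax:
        return None
    return row_minima.index(maximin), col_maxima.index(minimax), maximin

soul_vs_pure = [[30000, -21000, 1000],
                [18000, 14000, 12000],
                [-6000, 28000, 4000],
                [18000, 6000, 2000]]
print(saddle_point(soul_vs_pure))   # (1, 2, 12000) -> strategies S2 and P3, value 12000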

Check Your Progress 2


1. In game theory, the outcome or consequence of a strategy is referred to as the:
a) Payoff
b) Penalty
c) Reward
d) End game strategy
2. The saddle point in a pay-off matrix is always
a) Largest value in matrix
b) Smallest no in its column and smallest no in its row
c) Largest no in its column and smallest no in its row
d) Smallest number in the matrix

3.4 Game with No Saddle Point

It is possible that a game has no saddle point, and hence it is not possible to find a solution in
terms of pure strategies (the maximin and minimax rules). To solve such problems we need to
employ mixed strategies. A mixed strategy represents a combination of two or more strategies
that are selected one at a time, with pre-determined probabilities. Therefore, in a mixed strategy,
a player decides to choose among various alternatives in a certain ratio.

Illustration 2: The following is the payoff matrix of a game being played by A and B. Determine
the optimal strategies for the players and the value of the game.

                          B's Strategy       Row Minima
                          B1       B2
A's Strategy     A1        9       −6            −6
                 A2       −5        5            −5
Column Maxima              9        5

As can be seen from the table, the maximin value (−5) is not equal to the minimax value (5),
implying that there is no saddle point in this problem.

With mixed strategies, let the player A employs A1 strategy with a probability of x and A2
strategy with a probability of (1-x) . If B plays strategy B1, the A’s expected payoff can be
determined from the first column of the pay-off matrix as follows:

𝐸𝑥𝑝𝑒𝑐𝑡𝑒𝑑 𝑝𝑎𝑦 − 𝑜𝑓𝑓 = 9𝑥 − 5(1 − 𝑥)

Similarly, if B plays strategy B2, the expected payoff of A can be determined as follows:

𝐸𝑥𝑝𝑒𝑐𝑡𝑒𝑑 𝑝𝑎𝑦 − 𝑜𝑓𝑓 = −6𝑥 + 5(1 − 𝑥)

We shall find a value of x so that the expected payoff for A is the same irrespective of the
strategy adopted by B. This can be obtained by equating the two equations and solving it:
9𝑥 − 5(1 − 𝑥) = −6𝑥 + 5(1 − 𝑥)
or 9𝑥 − 5 + 5𝑥 = −6𝑥 + 5 − 5𝑥
or 25 𝑥 = 10
or 𝑥 = 10 /25 = 2/5

A will do best by choosing strategies A1 and A2 in the proportion 2:3 (i.e. A1 2/5 of the time and
A2 3/5 of the time). The expected pay-off for A applying this mixed strategy is:

9x − 5(1 − x) = 9 × 2/5 − 5(1 − 2/5) = 3/5
or
−6x + 5(1 − x) = −6 × 2/5 + 5(1 − 2/5) = 3/5

Thus firm A will have a net gain of 3/5 in the long run.

We can determine the mixed strategy of B in a similar way. Thus if player B plays B1 with a
probability of y and B2 with a probability of (1-y), then

𝐸𝑥𝑝𝑒𝑐𝑡𝑒𝑑 𝑝𝑎𝑦 − 𝑜𝑓𝑓 ( 𝑔𝑖𝑣𝑒𝑛 𝑡ℎ𝑎𝑡 𝐴 𝑝𝑙𝑎𝑦𝑠 𝐴1 ) = 9𝑦 − 6(1 − 𝑦)


𝐸𝑥𝑝𝑒𝑐𝑡𝑒𝑑 𝑝𝑎𝑦 − 𝑜𝑓𝑓 ( 𝑔𝑖𝑣𝑒𝑛 𝑡ℎ𝑎𝑡 𝐴 𝑝𝑙𝑎𝑦𝑠 𝐴2 ) = −5𝑦 + 5(1 − 𝑦)

We can determine the value of y, as follows


9𝑦 − 6(1 − 𝑦) = −5𝑦 + 5(1 − 𝑦)
or 9𝑦 − 6 + 6𝑦 = −5𝑦 + 5 − 5𝑦
or 25 𝑦 = 11
or 𝑦 = 11/25

Thus B would play strategies B1 and B2 in the ratio 11:14 in a random manner.
The expected pay-off (in terms of A) when B applies this mixed strategy is:

9y − 6(1 − y) = 9 × 11/25 − 6(1 − 11/25) = 3/5
or
−5y + 5(1 − y) = −5 × 11/25 + 5(1 − 11/25) = 3/5

Thus firm B will have a net loss of 3/5 in the long run.

Thus, we conclude that A and B should both use mixed strategies as given below and the value
of the game in long run is 3/5

Strategy Probability
For A, A1 2/5
A2 3/5
For B, B1 11/25
B2 14/25

In general, for a zero-sum two-person game in which players A and B have strategies A1, A2 and
B1, B2 respectively and the payoffs are as given below, let x be the probability of player A
choosing strategy A1 and y the probability of player B choosing strategy B1:

B’s Strategy
B1 B2
A’s Strategy A1 A11 A12
A2 A21 A22

Then,

x = (A22 − A21) / [(A11 + A22) − (A12 + A21)]

y = (A22 − A12) / [(A11 + A22) − (A12 + A21)]

V = (A11·A22 − A12·A21) / [(A11 + A22) − (A12 + A21)]

By substituting the values into these equations, we obtain results matching those found earlier:

x = (5 − (−5)) / [(9 + 5) − (−6 − 5)] = 10/25 = 2/5

y = (5 − (−6)) / [(9 + 5) − (−6 − 5)] = 11/25

V = (9 × 5 − (−6)(−5)) / [(9 + 5) − (−6 − 5)] = 15/25 = 3/5
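
These three expressions are easy to apply mechanically. The following Python sketch (an illustrative addition, not from the original text) evaluates them for any 2 x 2 game without a saddle point, and the example call reproduces Illustration 2.

def mixed_2x2(a11, a12, a21, a22):
    # optimal probabilities (x for A1, y for B1) and value V of a 2 x 2 game
    d = (a11 + a22) - (a12 + a21)
    x = (a22 - a21) / d
    y = (a22 - a12) / d
    v = (a11 * a22 - a12 * a21) / d
    return x, y, v

print(mixed_2x2(9, -6, -5, 5))   # (0.4, 0.44, 0.6), i.e. x = 2/5, y = 11/25, V = 3/5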

Check Your Progress 3


1. In a mixed strategy, each player should optimize
a) Maximum pay-off
b) Minimum loss
c) Maximum gain
d) Expected gain
2. Consider the following two-person game. What proportion of the time will player Y
employ strategy Y1?
Y1 Y2
X1 6 3
X2 2 8
a) 1/3
b) 2/3
c) 4/9
d) 5/9

3.5 Principle of Dominance

Sometimes a strategy available to a player is found to be better than some other strategy or
strategies. Such a strategy is said to dominate the others. This concept is useful in simplifying
games and finding the solution to a game problem. Consider the following example:
B’s Strategy
B1 B2 B3
A’s Strategy A1 0 -1 2
A2 5 4 -3
A3 2 3 -4

Let us follow the usual procedure for identifying a pure strategy and compute the row minima
and column maxima as below:
B’s Strategy Row Minima
B1 B2 B3
A’s Strategy A1 0 -1 2 -1*
A2 5 4 -3 -3
A3 2 3 -4 -4
Column Maxima 5 4 2*
The maximum of the row minima is −1 and the minimum of the column maxima is 2. As the
maximin and minimax values are not equal, this two-person zero-sum game does not have an
optimal pure strategy. For a problem larger than a 2 × 2 matrix, we cannot compute the mixed
strategy probabilities using the algebraic equations as we did in the previous section.

If a game larger than 2 × 2 requires a mixed strategy, we need to reduce the size of the matrix by
looking for dominated strategies. A strategy is dominated if another strategy is at least as good
regardless of what the opponent does. For example, comparing strategies A2 and A3: in column
B1, 5 > 2; in column B2, 4 > 3; and in column B3, −3 > −4. Thus, regardless of what player B
does, player A will always obtain higher values from strategy A2 than from A3. Therefore we
can say strategy A2 dominates strategy A3, and strategy A3 can be dropped from player A's
consideration. This helps us to reduce the size of the game. After eliminating A3, the game
becomes:
B’s Strategy
B1 B2 B3
A’s Strategy A1 0 -1 2
A2 5 4 -3

Now if we compare A1 and A2, we cannot find a dominated strategy. Next we look for dominated
strategies of player B. We should remember that player B looks for smaller values, as the matrix
is in terms of A's payoff. Comparing strategies B1 and B2: in row A1, −1 < 0; in row A2, 4 < 5.
Thus, regardless of what player A does, player B would always prefer the smaller values of
strategy B2 over strategy B1. Therefore B1 is dominated by strategy B2 and hence is eliminated.

B’s Strategy
B2 B3
A’s Strategy A1 -1 2
A2 4 -3

By successively eliminating dominated strategies, we reduce the game to a 2 X 2 game. The


algebraic solution procedure described in the earlier section can now be used to find the optimal
probabilities for a mixed strategy problem.
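
The successive elimination just described can be expressed as a small routine. The following Python sketch (an illustrative addition, not from the original text) repeatedly drops weakly dominated rows (for A) and columns (for B, who prefers smaller payoffs); the example call reproduces the reduction carried out above.

def reduce_by_dominance(payoff):
    # return the indices of the rows and columns that survive dominance elimination
    rows = list(range(len(payoff)))
    cols = list(range(len(payoff[0])))
    changed = True
    while changed:
        changed = False
        for r in rows[:]:
            if any(all(payoff[s][c] >= payoff[r][c] for c in cols)
                   for s in rows if s != r):
                rows.remove(r)
                changed = True
        for c in cols[:]:
            if any(all(payoff[r][d] <= payoff[r][c] for r in rows)
                   for d in cols if d != c):
                cols.remove(c)
                changed = True
    return rows, cols

game = [[0, -1, 2], [5, 4, -3], [2, 3, -4]]
print(reduce_by_dominance(game))   # ([0, 1], [1, 2]) -> A1, A2 remain against B2, B3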

The 2 × 2 matrix can be solved using the algebraic method:

x = (A22 − A21) / [(A11 + A22) − (A12 + A21)] = (−3 − 4) / [(−1 − 3) − (2 + 4)] = −7/−10 = 7/10

y = (A22 − A12) / [(A11 + A22) − (A12 + A21)] = (−3 − 2) / [(−1 − 3) − (2 + 4)] = −5/−10 = 1/2

V = (A11·A22 − A12·A21) / [(A11 + A22) − (A12 + A21)] = ((−1)(−3) − 2 × 4) / [(−1 − 3) − (2 + 4)] = −5/−10 = 1/2

Thus, we conclude that A and B should both use mixed strategies as given below, and the value
of the game in the long run is 1/2:

Strategy         Probability
For A:   A1      7/10
         A2      3/10
         A3      0
For B:   B1      0
         B2      1/2
         B3      1/2

There are problems where, after applying the dominance rule, the game is reduced to a 2 × n or
an m × 2 matrix. In such cases the problem can be solved graphically.

Check your Progress 4

1. A strategy that is best regardless of what rival players do is called


a) First mover advantage
b) Dominant strategy
c) Minimax Strategy
d) Maximin strategy

2. Given the following two person game, which strategy can be eliminated by
dominance rule
Y1 Y2
X1 9 13
X2 12 8
X3 6 14
a) X1
b) X2
c) X3
d) None of the above

3.6 Solution of 2 X n and m X 2 games

When player A has only 2 strategies to choose from and player B has n, the game is of the
order 2 × n, whereas when B has only two strategies and A has m strategies, the game is an
m × 2 game.

The problem may originally be a 2 × n or an m × 2 game, or it might have been reduced to such a
size after applying the dominance rule. Using the graphical method, the aim is to reduce the game
to the order 2 × 2 by identifying and eliminating the dominated strategies, and then to solve it by
the algebraic method used earlier. The game value and optimal strategies can be read from the
graph, but generally the algebraic method is adopted to get the exact answer.

Let us consider the following game using the graphical approach.


Player B’s Strategy
B1 B2
A1 6 -7
Player A’s Strategy A2 1 3
A3 3 1
A4 5 -1

Here the payoff matrix consists of m rows and 2 columns, so we will discuss how to solve an
m × 2 game. The first step is to check whether the problem has a saddle point or not. As can be
seen below, this game has no saddle point:

Player B’s Strategy Row Minima


B1 B2
A1 6 -7 -7
Player A’s Strategy A2 1 3 1*
A3 3 1 1*
A4 5 -1 -1
Column Maxima 6 3*

Next we try to simplify the matrix by applying the dominance rule. In this problem no strategy is
dominated, so we cannot simplify the matrix any further.

Let y be the probability that player B selects strategy B1 and (1 − y) the probability that player B
selects strategy B2. When player A chooses to play A1, the expected payoff (in terms of A) is
6y − 7(1 − y) = 13y − 7. Similarly, the expected payoffs of strategies A2, A3 and A4 are found
and shown in the table below. To plot the graph, the value of the payoff at y = 0 and y = 1 is also
calculated for each strategy.

Player A's Strategy    Expected Pay-off              Pay-off at y = 0    Pay-off at y = 1
A1                     6y − 7(1 − y) = 13y − 7              −7                   6
A2                     1y + 3(1 − y) = −2y + 3               3                   1
A3                     3y + 1(1 − y) = 2y + 1                1                   3
A4                     5y − 1(1 − y) = 6y − 1               −1                   5
Plot the pay-off values, using appropriate scaling, with y on the horizontal axis and the pay-off
values on the vertical axis. This is shown in the figure below. The lines are marked A1, A2, A3
and A4; they represent the respective strategies. For each value of y, the height of a line at that
point denotes the pay-off of that strategy of A against B's mixed strategy (y, 1 − y). Player B is
concerned with the maximum pay-off it may have to concede when it plays a particular mixture,
which is represented by the uppermost region formed by the four lines, and wishes to choose y so
as to minimize this maximum pay-off. The lowest intersection point on the upper boundary of the
graph is the minimax point for player B. ABCD is the upper boundary of the graph and its lowest
point is B, so this is the minimax point. As can be seen from the graph, more than two lines pass
through point B. We select any two of these lines with opposite slopes, so either A2 and A3 or A2
and A4 can be selected. Here we select A2 and A4. The reduced pay-off matrix will be as follows:

Player B’s Strategy


B1 B2
A2 1 3
Player A’s Strategy A4 5 -1

The 2 × 2 matrix can be solved using the algebraic method:

x = (A22 − A21) / [(A11 + A22) − (A12 + A21)] = (−1 − 5) / [(1 − 1) − (3 + 5)] = −6/−8 = 3/4

y = (A22 − A12) / [(A11 + A22) − (A12 + A21)] = (−1 − 3) / [(1 − 1) − (3 + 5)] = −4/−8 = 1/2

V = (A11·A22 − A12·A21) / [(A11 + A22) − (A12 + A21)] = (1 × (−1) − 3 × 5) / [(1 − 1) − (3 + 5)] = −16/−8 = 2
Thus, we conclude that A and B should both use mixed strategies as given below, and the value
of the game in the long run is 2:

Strategy Probability
For A, A1 0
A2 3/4
A3 0
A4 1/4
For B, B1 1/2
B2 1/2
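
The graphical argument for this m x 2 game can also be checked numerically. The following Python sketch (an illustrative addition, not from the original text) evaluates the upper envelope of the four expected-payoff lines over a grid of y values and picks the y that minimizes it.

payoffs = [(6, -7), (1, 3), (3, 1), (5, -1)]   # rows A1..A4, columns B1 and B2

def upper_envelope(y):
    # B's worst case at a given y: the highest of the four expected-payoff lines
    return max(b1 * y + b2 * (1 - y) for b1, b2 in payoffs)

ys = [i / 1000 for i in range(1001)]
best_y = min(ys, key=upper_envelope)
print(best_y, upper_envelope(best_y))   # 0.5 and 2.0, matching y = 1/2 and V = 2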

If it is a 2 × n game, the expected pay-off values are calculated for player A: x is the probability
of choosing strategy A1 and (1 − x) the probability of choosing strategy A2. The horizontal axis
represents the values of x and the vertical axis the pay-off of player A. The highest intersection
point on the lower boundary of the graph is the maximin point for player A.

Summary of the steps for solving two-person zero-sum games
1. Use the maximin strategy for player A and the minimax strategy for player B to determine
   whether a pure-strategy solution exists. If there is a saddle point, it is the optimal
   solution.
2. If a pure-strategy solution does not exist and the game is larger than 2 X 2, identify a
   dominated strategy to remove a row or a column. Develop a reduced pay-off table and continue
   to apply the dominance rule to remove as many rows and columns as possible.
3. If the reduced game is 2 X n or m X 2, solve it graphically to reduce it to a 2 X 2 matrix.
4. If the reduced game is 2 X 2, solve for the optimal mixed-strategy probabilities using the
   algebraic method.

If the game cannot be reduced to a 2 X 2 game, a linear programming model is used to solve
for the optimal mixed-strategy probabilities; that approach is beyond the scope of this unit.
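Step 1 of the summary above can be expressed compactly. The following is a minimal sketch in Python with numpy (an assumption, since this unit prescribes no software) that compares the maximin of the row minima with the minimax of the column maxima to test for a saddle point:

```python
import numpy as np

def saddle_point(payoff):
    """Return the game value if a pure-strategy saddle point exists, else None."""
    payoff = np.asarray(payoff)
    maximin = payoff.min(axis=1).max()   # best of A's row minima
    minimax = payoff.max(axis=0).min()   # best of B's column maxima
    return maximin if maximin == minimax else None

# The m x 2 game discussed above has no saddle point (maximin 1, minimax 3) ...
print(saddle_point([[6, -7], [1, 3], [3, 1], [5, -1]]))   # None
# ... whereas this hypothetical game has one, with value 4.
print(saddle_point([[4, 6], [2, 1]]))                     # 4
```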

Check your Progress 5


1. In a 2 X n game, the graphical solution is obtained from which value on the lower
   boundary?
   a) Highest value
   b) Lowest value
   c) Average value
   d) None of the above
2. If, in an m X 2 game, Player A has m strategies and Player B has two strategies, the
   y-axis is used to represent
   a) x values
   b) y values
   c) Expected payoff for B
   d) Expected payoff for A
3.7 Let us sum up

In this unit, we described how to solve two-person zero-sum games. In these games the gain
(loss) of one player and the loss (gain) of the other player always sum to zero. The steps used
to determine whether a two-person zero-sum game results in an optimal pure strategy were
discussed. If a pure strategy exists, a saddle point determines the value of the game. If an
optimal pure strategy does not exist for a two-person zero-sum 2 X 2 game, the algebraic method
is used to derive the mixed-strategy probabilities. In a mixed strategy, each player uses
probabilities to select a strategy for each play of the game. The dominance rule used to reduce
the size of a mixed-strategy game was also discussed. If the elimination of dominated strategies
can reduce a larger game to a 2 X 2 game, the algebraic solution procedure is used to find a
solution. The solution of 2 X n and m X 2 games using the graphical method was also discussed.

3.8 Answers for Check Your Progress


Check your progress 1
Answers
1. False
2. True
3. (b)
4. (d)

Check your progress 2


Answers
1. (a)
2. (c)

Check your progress 3


Answers
1. (d)
2. (d)

Check your progress 4


Answers
1. (b)
2. (d)

Check your progress 5


Answers
1. (a)
2. (c)

3.9 Glossary
Game theory: The study of decision situations in which two or more players compete as
adversaries.
Two-person Zero-sum game: A game with two players in which the gain to one player is equal
to the loss to the other player.
Optimal Strategy: A strategy in which the player can achieve the maximum payoff is called the
optimal strategy.
Saddle point: A condition that exists when pure strategies are optimal for both players in a two-
person zero-sum game.
Payoff Matrix: The tabular display of the payoffs of the players under various alternatives is
called the payoff matrix.
Pure strategy: A game solution that provides a single best strategy for each player.
Mixed strategy: A game solution in which each player selects the strategy to play from among
several strategies according to a set of probabilities.
Dominated strategy: A strategy is dominated if another strategy is at least as good for every
strategy that the opposing player may employ.

3.10 Assignment
1 What is game theory? What do you understand by ‘zero-sum’ in the context of game theory?
2 Explain the following: Saddle point, Pure strategy, Mixed strategy
3 Explain the concept of dominance with examples.
4 For the following two-person, zero-sum game, find the optimal strategies for the two
  players and the value of the game:

                          B’s Strategy
                         B1     B2     B3
                  A1      5      9      3
   A’s Strategy   A2      6    -12    -11
                  A3      8     16     10
5 Solve the following game graphically:

                              Player B’s Strategy
                                 B1       B2
                        A1        3        4
                        A2       -3       12
   Player A’s Strategy  A3        6       -2
                        A4       -4       -9
                        A5        5       -3

3.11 Activities
Discuss applications of game theory with examples

3.12 Case Study

Two television stations in a market compete with each other for the viewing audience. Local
programming options for the 5.00 pm weekday time slot include a sitcom rerun, an early news
program or a travel show. Assume that each station has the same three programming options and
must make its preseason program selection before knowing what the other television station will
do. The changes in viewing audience, in thousands of viewers, for Station A are as follows:

                                        Station B
                          Sitcom, b1    News, b2    Travel, b3
            Sitcom, a1        70           80           50
Station A   News, a2          90           60           95
            Travel, a3       105           90           65

Determine the optimal programming strategy for each station. What is the value of the game?

3.13 Further Reading

1. Operations Research, by Hamdy A. Taha, Pearson Education
2. Operations Research: Theory and Applications, by J.K. Sharma, Macmillan India Ltd.
3. Quantitative Techniques in Management, by N.D. Vora, McGraw Hill
4. Quantitative Methods for Business, by Anderson, Sweeney and Williams, Thomson
5. Quantitative Analysis, by Render, Stair, Hanna & Badri, Pearson Education
6. Operations Research, by Pradeep Pai, Oxford University Press
Block Summary

In this block, we discussed some operations research techniques in detail. In the first unit,
business situations pertaining to managing projects were discussed. Project management
techniques involve constructing a network diagram using the rules of networking. The project
networking technique with multiple estimates of activity time was also explained. In the second
unit, the waiting-line concept was introduced along with its applications, and some of the most
commonly used queuing models were covered. In the last unit, game theory and its applications
were discussed. The consequences of the interplay of combinations of strategies with a
competitor were explained, and the various methods employed to derive the optimal strategy were
covered.
Block Assignment

Short Answer Questions

1. Differentiate between CPM and PERT.
2. Define: dummy activity, predecessor activity, successor activity.
3. What are reneging and balking in waiting line models?
4. Give an example from real life for each of the following: (i) first-come-first-served
   (ii) last-come-first-served.
5. Describe the maximin and minimax principles of game theory.

Long Answer Questions


1. Which assumptions are necessary to employ the (M/M/1) waiting line model? Give a few
   examples of its use.
2. What are the three time estimates used in the context of PERT? How are the expected
   duration of a project and its standard deviation calculated?
3. Explain, with examples, the concept of dominance in game theory.
4. The following list of activities must be accomplished in order to complete a construction
project:
Activity   Precedence   Duration        Activity   Precedence   Duration
   A           -            3              F           C            7
   B           -            8              G          E, F          5
   C          A, B          4              H          D, F          6
   D           B            2              I          G, H          8
   E           A            1              J           I            9

a. Draw the network and find the project completion time


b. Calculate total float for each of the activities

5. For the following Two-person, zero-sum game, find the optimal strategies for the two
players and value of the game:
                       B’s Strategy
                      B1     B2     B3
               A1     30     40    -80
A’s Strategy   A2      0     15    -20
               A3     90     20     50

6. Customers of a local bakery arrive randomly, following a Poisson process. The single
   salesman can attend to customers at an average rate of 20 customers per hour, the service
   time being exponentially distributed. The mean arrival rate of customers is 12 per hour.
   Determine the following:
   a. The mean number of customers in the bakery
   b. The mean time spent by a customer in the bakery
   c. The expected number of customers waiting in the queue
   d. The mean waiting time of a typical customer in the queue
