Contents
Unit 3 TRANSPORTATION
Introduction, Basic structure of transportation, Transportation problem- Initial Basic
feasible solution (North west corner rule, Least Cost Rule, Vogel’s approximation
method), Test for optimality (The Modified Distribution (MODI) method), Special cases
of transportation
Unit 4 ASSIGNMENT
Introduction, Basic structure of assignment, Approach of the Assignment model, Solution
Method (Hungarian method), Special cases of Assignment
_______________________________________________________________
Unit 5 GAME THEORY
Introduction, Basic Concepts in Game Theory, Two-person zero-sum game, Game with no Saddle Point, Principle of Dominance, Solution of 2 × n and m × 2 games
Block no.1 Introduction to Quantitative Management
and Statistical methods
_________________________________
Block Introduction
In this block, an introduction to quantitative methods will be given. The basic difference between
statistics and operations research will be discussed, along with the role, importance and
applications of quantitative methods in business. In the second unit, the meaning and
importance of measures of central tendency will be discussed, and the various measures of central
tendency will be compared. In the third unit, discrete probability distributions and their
various types will be discussed. In the last unit, continuous probability distributions and
their various applications will be covered.
Block Objective
• Understand the meaning of quantitative methods
• Identify the various situations where discrete probability distributions can be applied.
Block Structure
1.1 Introduction
1.12 Glossary
1.13 Assignment
1.14 Activities
1.15 Case Study
1.1 Introduction
Decision making is an integral part of the management of an organization. Every day, business
managers are required to make decisions. The key managerial functions of planning, organizing,
directing and controlling require management to be engaged continuously in the process of
decision making pertaining to each of them. So we can say that management can be regarded as
equivalent to decision making.
Historically, decision making was considered purely an art, acquired over a period of time
through experience. Various styles of decision making were observed when different people solved
similar managerial problems in real business situations. Many times managers resort to their
“instincts” to make decisions (unstructured decision making). However, the environment in
which management has to operate these days is complex and fast changing. There is a great
need to augment the art of decision making with systematic and scientific methods.
Most decisions cannot be taken on the basis of ‘rule of thumb’ or common sense or snap
judgment. For businesses, a single wrong decision may have painful long-term implications.
Present-day managers cannot work by trial and error. A systematic approach to decision
making is necessary, as the cost of making errors may be too high and at times irreversible.
Thus the managers in the business world should understand the importance of scientific
methodology of decision making. It means defining the problem in a clear manner, collecting
required data, analyzing the data thoroughly, deriving and forming conclusions about the data
and finally implementing the solution.
Although qualitative skills are inherent in the manager and usually improve with experience,
the skills of the quantitative approach need to be learned by studying its assumptions and
methods. A manager who is knowledgeable in quantitative methods can compare and evaluate
the qualitative and quantitative sources of recommendations and finally combine the two
to choose the best possible decision.
1.3.1 Statistical Methods
Statistics is a science dealing with the collection, analysis, interpretation and presentation of
numerical data. As an example, suppose that a company is interested in knowing the
satisfaction level of its consumers. The first step will be to collect data on the satisfaction
level, the factors of satisfaction and other variables related to consumer behavior. The data so
obtained can be organised on the basis of various demographic and classification variables like
age, income, gender, education level, region, etc. This organised data may now be presented by
means of tables or various types of graphs to facilitate analysis. The average satisfaction level
can be derived and further compared on the basis of measured variables like age. This
information will help to determine if a particular age group is more satisfied compared to
others. Similarly, various kinds of analysis will give insights for drawing conclusions about the
population being studied. This will further help in decision making related to improving the
satisfaction level of customers of the targeted product.
The same data can be called primary or secondary, depending on who is using it. For example,
suppose a researcher wants to study the economic conditions of labourers in India.
If the researcher collects the data directly using a questionnaire, it is called ‘primary data’.
However, if some other researcher subsequently uses this data for some other purpose, the
same data becomes ‘secondary data’.
Whenever one is doing research, it must first be checked whether any secondary data is available
on the subject matter of interest which can be used, as this will save a lot of time and money.
However, the data must be verified thoroughly for reliability and accuracy. Its relevance and
the context in which it was collected should also be verified, since it was originally collected
for another purpose. The researcher would need to collect original data according to his
objectives when secondary data is either not available or not reliable.
There are many international bodies which regularly collect and publish large amounts of data,
such as the International Monetary Fund (IMF), World Health Organization (WHO), Asian
Development Bank, International Labour Organisation (ILO), United Nations Organization,
World Meteorological Organization and Food and Agriculture Organization (FAO); the
Government and its many agencies: Reserve Bank of India, Census Commission, ministries such
as the Ministry of Economic Affairs and the Commerce Ministry; private research organisations;
trade associations, etc. Examples of government publications in India are the Report on Currency
and Finance, India Trade Journal, Statistical Abstract of India, Indian Customs and Central
Excise Tariff, Reserve Bank of India Bulletin, Agricultural Statistics of India, Economic Survey
and Indian Foreign Statistics.
1. Problem Formulation: The first step in operations research is to develop a clear and
concise statement of the problem. It is essential to identify and understand the root problem
to get the right answer. The symptom should not be confused with the problem. For example,
higher production cost is a symptom, where the underlying problem may be improper
inventory levels, excessive wastage, poor quality control, etc. The symptoms are only an
indication of the problem, and hence the manager should go beyond the symptoms to identify
its real cause. Also, there may be multiple problems, and one may be related to another. The
organization often selects those problems whose solution would either increase profit or
decrease cost. So it is imperative for the analyst to have extensive interaction with the
management regarding the selection and interpretation of the available data. This step often
involves activities like site visits, meetings, research, conferences and observations, which
provide the analyst with the information required to formulate the right problem.
2. Model Building: Once a problem is identified, the next step is to develop a model. A
model is a representation of some abstract or real-life problem. The models are basically
mathematical models, which describe systems and processes in the form of equations,
formulae and relationships. The activities in this step involve defining the variables, studying
their relationships and formulating equations to represent the problem. The model is then
tested under different environmental constraints and revised until it works.
3. Obtaining the input data: The next step is to obtain the data to be used in the model as
input. The data should be accurate, relevant and complete in all respects. The quality of
the input data will decide the quality of output. A number of resources including
company reports and documents, interviews with company employees may be used for
data collection.
4. Solution of the Model: The next stage of analysis is finding the solution and interpreting
it in the context of the problem. A solution to a model means determination of a specific
set of decision variables that would give a desired level of output. The desired level of
output is the level which ‘optimises’. Optimisation means either maximisation of goal
attainment from a given set of resources, or minimisation of the cost that will satisfy the
required level of goal attainment.
5. Model Validation: Validation means checking whether the developed model adequately
predicts the behavior of the actual system it represents. It involves checking the reliability
of the model and ascertaining that its structural assumptions are met. A normal practice is
to test validity by comparing the model’s performance on available past data with that of
the actual system.
6. Implementation: The final step is the implementation of the results. It is the process of
incorporating the developed model as a solution in the organization. The techniques and
methods of operations research are based on mathematical concepts and neglect the
human aspects, which are most important at the time of implementation. The impact of
the decision will be influenced by the level of motivation, resistance to change and desire to
be informed among employees. It is very important to handle these issues tactfully
for successful implementation of the solution. A model which gives average theoretical
advantage but is implementable is better than one which ranks high on theoretical
advantage but cannot be implemented.
Check your Progress 1
1. Individual respondents, focus groups, and panels of respondents are categorised as
a) Primary Data Sources
b) Secondary Data Sources
c) Itemized Data Sources
d) Pointed Data Sources
Inferential Statistics: If a researcher gathers data from a sample and uses the statistics generated
to reach conclusions about the population from which the sample was taken, the statistics are
inferential statistics. The data gathered are used to infer something about a larger group.
Continuing with the same example, if the professor uses statistics on the average grade achieved
by one class to estimate the average grade achieved by all five sections of the same English
course, the process of estimating this average grade would be called inferential statistics.
Inferential statistics are sometimes also referred to as inductive statistics. We need to understand
the words ‘statistic’ and ‘parameter’ to understand inferential statistics.
• A statistic is a descriptive measure computed from a sample of data, e.g. the mean (x̄)
and standard deviation (s) of a sample are known as ‘Statistic’.
• A parameter is a descriptive measure computed from an entire population of data,
e.g. the mean (µ) and standard deviation (σ) of a population are known as ‘Parameter’.
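The distinction can be shown with a short Python sketch using the standard-library statistics module (the grade values below are made up for illustration):

```python
import statistics

# Hypothetical population: grades of every student in the course
population = [62, 55, 71, 48, 66, 59, 73, 50, 64, 57]
sample = population[:4]                # a sample drawn from the population

mu = statistics.mean(population)       # parameter: population mean (mu)
xbar = statistics.mean(sample)         # statistic: sample mean (x-bar)
sigma = statistics.pstdev(population)  # parameter: population standard deviation
s = statistics.stdev(sample)           # statistic: sample standard deviation (n-1 divisor)
```

Note that `stdev` divides by n-1 (sample) while `pstdev` divides by n (population), mirroring the statistic/parameter distinction.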
Iconic models represent a system the way it is, but on a different scale. They are essentially
scaled-up or scaled-down versions of the particular thing they represent. An iconic model is obtained by reducing or
enlarging the size of the system. In other words they are images. A model of a proposed building
by an architect, model of solar system, model of molecular structure of a chemical, a toy
aeroplane are some examples of iconic model. Maps, photographs, drawings may also be
categorized as iconic models as they look like what they represent except in size. The advantage
of iconic models is that they are specific and represent the thing visually. But the disadvantage
is that they cannot be manipulated for experimental purposes, and they cannot be used to study
changes in the operation of a system.
The analogue models use one set of properties to represent another set of properties. After the
problem is solved, it is interpreted in terms of the original system. For example the electrical
network model may be used as an analogue model to study the flows in a transportation system.
The contour lines on a map are analogues of elevation as they represent the rise and fall of
height. In general, the analogue models are less specific and concrete as compared to the iconic
models and can be more easily manipulated.
In symbolic models letters, numbers and other types of mathematical symbols are used to
represent variables and the relationship between them. These are the most general and abstract
type of models. These models can be verbal or mathematical. Verbal models represent a
situation in spoken or written words, whereas mathematical models use mathematical
notation to represent the situation. The difference between the two can be understood by taking
the example of measuring the area of a rectangle. A verbal model would express it as: the area of
the rectangle (A) is equal to the length (L) of the rectangle multiplied by its breadth (B). The
mathematical model is represented as: A = L × B. Both models yield the same result;
however, the mathematical model is more precise.
Symbolic models are used in Operations research as they are easier to manipulate and yield
better results as compared to iconic or analogue models.
Once the data is collected, it needs to be summarized and presented to the decision maker in a form
that is easy to understand and comprehend. Tabulation helps this process through effective
presentation. Classification of the data showing the different values of the variable and their
respective frequencies of occurrence is called frequency distribution of the values. There are two
kinds of frequency distribution- discrete frequency distribution and continuous frequency
distribution. Graphical representation is more effective in communicating the information.
Through graphs and charts, the decision maker can often get an overall picture of the data and
reach very useful conclusions merely by studying the chart or graph.
The concept of central tendency plays an important role in the study and application of statistics.
There is an inherent tendency of the data to cluster or group around central value. This behavior
of the data to concentrate the values around central part of data is called as ‘Central tendency’ of
the data. Measures of central tendency enable us to find that single value at which the data is
considered to be concentrated. They also help to compare two or more sets
of data, for example the average sales figures of two months. There are three common measures of
central tendency- Mean, Median and Mode. Mean is the most widely used measure. Arithmetic
mean is the average of a group of numbers and is computed by summing all numbers and
dividing by the number of observations. The median is the middle value in a set of data that has
been ordered from lowest to highest (ascending) or highest to lowest (descending).It is the value
that splits ordered data into two equal parts. The mode is the most frequently occurring value of a
set of data.
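The three measures can be computed directly with Python's standard-library statistics module (the data set below is hypothetical):

```python
import statistics

data = [2, 3, 5, 5, 5, 8, 9, 10, 13, 20]   # hypothetical observations

mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # middle value of the ordered data
mode = statistics.mode(data)      # most frequently occurring value
```

For an even number of observations, `median` averages the two middle values, as it does here.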
Measures of variability
Measures of variability explain the spread or dispersion of a set of data. It explains the variation
in the values and how different the values are from the mean. Usually measures of variability are
used together with the measures of central tendency to make a complete description of the data.
There are a number of measures of dispersion, like range, interquartile range, mean absolute
deviation, variance and standard deviation.
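Several of these measures can be sketched in a few lines of Python (the data values are hypothetical):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations (mean = 5)

data_range = max(data) - min(data)                  # range
var = statistics.pvariance(data)                    # population variance
sd = statistics.pstdev(data)                        # population standard deviation
mean = statistics.mean(data)
mad = sum(abs(x - mean) for x in data) / len(data)  # mean absolute deviation
```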
Probability Distribution
Correlation
Correlation is a measure of the degree of relatedness of variables.For example, how strong is the
correlation between the producer price index and the unemployment rate? In retail sales, are
sales related to population density, number of competitors, size of the store, amount of
advertising, or other variables? The correlation coefficient measures the degree of association of
one variable with the other. The Pearson product-moment correlation (r) is used when both
variables being analyzed have at least an interval level of data. The term r is a measure of the
linear correlation of two variables. It is a number that ranges from -1 to 0 to +1, representing the
strength of the relationship between the variables. An r value of +1 denotes a perfect positive
relationship between two sets of numbers. An r value of -1 denotes a perfect negative
correlation, which indicates an inverse relationship between two variables. An r value of 0 means
no linear relationship is present between the two variables.
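A minimal sketch of computing r from its definition (covariance divided by the product of the deviations' root sums of squares); the data values are hypothetical:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

ads = [1, 2, 3, 4, 5]        # hypothetical advertising spend
sales = [2, 4, 6, 8, 10]     # sales exactly proportional to spend
r = pearson_r(ads, sales)    # close to +1: a perfect positive relationship
```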
Regression
Regression analysis is the process of developing a model to predict the value of a numerical
variable based on the values of one or more other variables. The most elementary regression
model is called simple regression or bivariate regression, involving two variables in which one
variable is predicted by another variable. In simple regression, the variable to be predicted is
called the dependent variable and is designated as y. The predictor is called the independent
variable, or explanatory variable, and is designated as x. In simple regression analysis, only a
straight-line relationship between two variables is examined. In multiple regression, more than
one independent variable is used to predict the dependent variable.
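A least-squares fit of the simple regression line y = b0 + b1·x can be sketched as follows (the x and y values are hypothetical and chosen to lie exactly on a line):

```python
def fit_line(x, y):
    """Least-squares estimates of intercept b0 and slope b1 in y = b0 + b1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    return b0, b1

x = [1, 2, 3, 4]     # hypothetical independent (predictor) variable
y = [3, 5, 7, 9]     # dependent variable, here exactly y = 1 + 2x
b0, b1 = fit_line(x, y)
```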
Forecasting
Forecasting is the art or science of predicting the future values of a variable. Forecasting methods
can be classified as qualitative and quantitative. The quantitative methods can be used only
when the variable under study can be quantified and historical data is available. A time series
is a set of observations of a variable measured over a period of time at regular intervals. The
objective of time series method is to discover a pattern in the historical data and then extrapolate
this pattern into the future.
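The idea of extrapolating a historical pattern can be illustrated with one of the simplest time-series methods, a moving-average forecast (the sales figures are hypothetical):

```python
def moving_average_forecast(series, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)

sales = [20, 22, 25, 24, 26, 28]                     # hypothetical monthly sales
forecast = moving_average_forecast(sales, window=3)  # (24 + 26 + 28) / 3
```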
Decision Theory
Decision theory, also called decision analysis, is used to determine the optimal strategy when a
decision maker is faced with several decision alternatives and an uncertain pattern of future
events. All decision making situations have usually two or more alternative courses of action
available to the decision maker to choose from. There are various possible outcomes, called
states of nature, which are beyond the control of decision maker. A decision may be defined as
the selection of an act which is considered to be the best according to a predefined standard, from
the available options.
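Two classic selection standards, maximin (pessimistic) and maximax (optimistic), can be sketched over a payoff table; all alternatives, states and payoffs below are hypothetical:

```python
# Hypothetical payoff table: rows are decision alternatives, columns are
# payoffs under three states of nature (beyond the decision maker's control).
payoffs = {
    "expand":     [80, 20, -10],
    "status quo": [40, 30, 15],
    "outsource":  [50, 25, 5],
}

# Maximin: choose the alternative whose worst payoff is the largest.
maximin_choice = max(payoffs, key=lambda alt: min(payoffs[alt]))
# Maximax: choose the alternative whose best payoff is the largest.
maximax_choice = max(payoffs, key=lambda alt: max(payoffs[alt]))
```

The pessimist guards against the worst state of nature; the optimist bets on the best one.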
Index Number
Index number is a ratio of a measure taken during one time period to that same measure taken
during another time period, usually denoted as base period. The ratio is often multiplied by 100
and expressed as a percentage. Index numbers are very useful for reflecting inter-period
differences. Using index numbers, a researcher can transform the data into values that are more
usable and make it easier to compare other years to one particular key year. Index numbers are
widely used around the world to relay information about stock prices, inflation, sales, exports,
imports, agricultural prices, etc. Some examples of specific indexes are the employment cost
index, price index for construction, producer price index and consumer price index. For example,
if the Consumer Price Index for the year 2020 is 150, it means prices have gone up by 50 %
relative to the base year; the Consumer Price Index (CPI-U) compiled by the Bureau of Labor
Statistics is based upon a 1982 base value of 100.
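The ratio-to-base computation is simple to sketch (the prices and years below are hypothetical):

```python
def index_number(value, base_value):
    """Simple index number: ratio to the base period, expressed as a percentage."""
    return value / base_value * 100

# Hypothetical average prices, with 2015 as the base period
prices = {2015: 120.0, 2018: 150.0, 2020: 180.0}
indexes = {year: index_number(p, prices[2015]) for year, p in prices.items()}
```

An index of 150 for 2020 here means prices are 50 % above the 2015 base.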
Check your Progress 3
1. The variables whose calculation is done according to the height, length, and weight are
categorised as
a) Discrete Variables
b) Flowchart Variables
c) Measuring Variables
d) Continuous Variables
2. The art or science of predicting the future values of a variable is called
a) Regression
b) Forecasting
c) Probability distribution
d) Index numbers
Linear programming:
It is a mathematical modeling technique for selecting the best alternative from a set of feasible
alternatives, in situations where the objective function as well as the constraints can be expressed
as linear mathematical function. The objective function may be maximization of profit /sales or
minimization of cost/time etc. There are many methods to solve a linear programming problem.
Transportation:
The transportation problem arises in planning for the distribution of goods and services from
various supply locations to different demand locations. Normally the quantity of goods available
at supply location (origin) is limited and the quantity of goods required at demand location
(destination) is known. Mostly the objective is to minimize the total transportation cost of
shipping the goods from origin to destination.
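The North-West Corner rule named in the Unit 3 contents, which finds an initial basic feasible solution, can be sketched as follows (the supply and demand figures are hypothetical):

```python
def northwest_corner(supply, demand):
    """Initial basic feasible solution by the North-West Corner rule.
    Assumes a balanced problem (total supply == total demand)."""
    supply, demand = supply[:], demand[:]        # work on copies
    alloc = [[0] * len(demand) for _ in supply]
    i = j = 0
    while i < len(supply) and j < len(demand):
        qty = min(supply[i], demand[j])          # ship as much as possible here
        alloc[i][j] = qty
        supply[i] -= qty
        demand[j] -= qty
        if supply[i] == 0:                       # origin exhausted: move down
            i += 1
        else:                                    # destination satisfied: move right
            j += 1
    return alloc

# Hypothetical balanced problem: 3 origins, 3 destinations, total = 75 units
alloc = northwest_corner([20, 30, 25], [10, 25, 40])
```

This ignores costs entirely; methods such as MODI then improve the allocation towards the minimum-cost solution.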
Assignment
An assignment problem arises in many decision making situations in an organization, like
assigning jobs to machines, workers to machines, clerk to counters, sales personnel to sales
territories etc. It is a special type of linear programming, with the constraint that one job can be
assigned to one and only one machine.
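For a small matrix, the one-to-one constraint can be illustrated by exhaustive search over all permutations (the cost matrix is hypothetical); the Hungarian method solves the same problem efficiently:

```python
from itertools import permutations

def min_cost_assignment(cost):
    """Try every one-to-one assignment of workers (rows) to jobs (columns)
    and keep the cheapest. Exhaustive search is only feasible for small n."""
    n = len(cost)
    best_cost, best_perm = None, None
    for perm in permutations(range(n)):          # perm[i] = job given to worker i
        total = sum(cost[i][perm[i]] for i in range(n))
        if best_cost is None or total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm

# Hypothetical cost matrix: cost[i][j] = cost of worker i doing job j
cost = [[4, 2, 8],
        [4, 3, 7],
        [3, 1, 6]]
best_cost, best_perm = min_cost_assignment(cost)
```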
Game theory
Game theory is used to make decisions in conflicting situations in which there are two or
more players/adversaries/opponents. Each player selects a strategy independently without
knowing in advance the strategy of other player or players. The combination of the competing
strategies provides the value of the game to the players. Game theory applications have been
developed for situations in which the competing players are teams, companies, political
candidates, armies or contract bidders.
Project scheduling
Managers are responsible for planning, scheduling and controlling projects that consist of
numerous jobs or tasks performed by various departments or individuals. The Program
Evaluation and Review technique (PERT) and Critical Path method (CPM) are extremely helpful
in these situations. The objective is to complete the project on time, adhering to the precedence
requirements (which means some activities must be completed before other activities can be
started).
Simulation
Simulation is one of the most widely used quantitative approaches of decision making. It
involves developing a model of some real phenomenon and then performing experiments on the
model. It is a descriptive, not an optimizing, technique. In simulation, a given system
is copied, and the variables and constants associated with it are manipulated in an artificial
environment to study the behavior of the system.
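A toy Monte Carlo sketch shows the idea: copy a system (here a one-product inventory with random daily demand) and experiment on the copy. All the numbers are hypothetical:

```python
import random

def average_daily_profit(days, stock=6, seed=42):
    """Toy simulation: daily demand is uniform on 0..9 units, each unit
    sold earns 5, each unsold unit of stock costs 2."""
    rng = random.Random(seed)        # fixed seed makes the run reproducible
    total = 0
    for _ in range(days):
        demand = rng.randint(0, 9)
        sold = min(demand, stock)
        total += 5 * sold - 2 * (stock - sold)
    return total / days

avg = average_daily_profit(10_000)
```

Rerunning with different `stock` values lets the analyst compare policies without touching the real system.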
Quantitative methods provide managers with a variety of tools from statistics and operations
research for handling problems in modern business in a scientific way.
1. Give accurate and specific description: The facts can be conveyed in a precise form when
stated quantitatively using statistics. For example, the statement that the infant mortality rate
was 30 % in 2018, as compared to 35 % in 2015, is more specific than stating that the infant
mortality rate in 2018 had decreased in comparison to 2015.
2. Convert data into information: Statistics help in reducing the amount of data collected and
convert it to more meaningful information for making decisions. For example, census data on the
number of members of individual households is a huge mass of data, and it would be difficult to
draw any conclusions from it without applying statistics.
3. Facilitate Comparison of data: It helps in the comparison of data, as the data is collected in
the form of numbers. The present data can be compared with the previously collected data to
study the pattern of increase or decrease in a phenomenon. For example there can be a
comparison of month-wise sales figure data of a company to identify the trend.
4. Forecast future events: Statistical methods are very useful in predicting future events. For
example, to take decisions on production scheduling, an automobile manufacturer would like
to know the past sales figures. Based on these figures, future sales can be predicted and
accordingly the required number of automobiles can be manufactured.
Tools for scientific analysis: Operations research models provide a systematic, scientific and
logical way of understanding and solving problems. Due to the increased complexity of business,
it is not possible to take decisions based on intuition alone. These techniques help the decision
maker to describe and solve the problem more precisely.
Choosing an optimal business strategy– Using operations research techniques like Game
theory, it is possible to determine the optimal strategy for an organization that is facing
competition from its rivals with conflicting interests.
Facilitate and improve the quality of decision making: A decision maker can use various
mathematical models to take better informed decisions in the face of uncertainty. Operations
research techniques like decision theory improve the quality of decision making. Multiple
variables or resources can be formulated and manipulated as a model to take optimum decisions.
The various methods used in statistics and operations research were discussed in brief. The
benefits and advantages of quantitative methods, along with their applications in various
functional areas, were also covered in this unit. The importance and complexity of the decision
making process has resulted in wide application of quantitative techniques.
1.12 Glossary
Statistics: Statistics is a science dealing with the collection, analysis, interpretation and
presentation of numerical data.
Descriptive Statistics: Data gathered on a group to describe or reach conclusions about that
same group.
Inferential Statistics: Data gathered from a sample and statistics generated to reach conclusions
about the population from which the sample was taken.
Primary Data: The data used in the study is collected specifically for the purpose of the study
Secondary Data: The data was collected for some other purpose and is derived from the other
sources.
Statistic: It is a descriptive measure computed from a sample of data.
Parameter: It is a descriptive measure computed from an entire population of data.
Random variable is a numerical description of the outcome of an experiment.
Discrete Random Variable: A random variable that can take a limited number of values,
basically whole numbers.
Continuous Random Variable: A random variable that can take any value over a range
(including decimal values).
Operations Research: A method of employing mathematical representations or models to
analyze business problems in order to take management decisions.
Model: A representation of a real object or situation.
Iconic Model: A physical replica or representation of a real object.
Analog Model: A model that represents a phenomenon of the world by another, more
understandable or analyzable system.
Mathematical Model: Mathematical symbols and expressions used to represent a real situation.
1.13 Assignment
1.14 Activities
Take an example of a major decision you have taken recently. List the steps you had taken to
reach the final decision.
1.15 Case Study
A manufacturing company makes electric wiring, which it sells to contractors in the
construction industry. Approximately 900 electric contractors purchase wire from the company.
The Director of Marketing wants to determine the electric contractors’ satisfaction. He developed
a questionnaire that yields a satisfaction score between 10 and 50 for participant responses. A
random sample of 35 of the 900 contractors is asked to complete the survey. The satisfaction
scores of the 35 participants are averaged to compute the average satisfaction score.
1. Describe the population and the sample for this study.
2. What will be the statistic and the parameter for this study?
3. How can the findings of this study be used in decision making?
1.16 Further Reading
2.0 Learning Objectives
2.1 Introduction
2.2 Measures of Central Tendency
2.4 Median
2.6 Quartile
2.1 Introduction
In the introductory chapter, an overview of the various types of statistical methods used in
management decision making was explained. The purpose of descriptive statistics is to describe
and summarise the data. Descriptive statistics include various measures like measures of central
tendency, measures of variation, measures of shape and measures of kurtosis. Measures of central
tendency are among the most important and widely used tools for describing and summarizing the
data. In this unit we will be exploring the concept of central tendency and the various measures used
to measure central tendency. The objective is to identify a single value which can act as a
representative of the given data. This value can be used to make conclusion and decision related
to the entire data set. The computation of various measures is different for ungrouped and
grouped data and hence will be discussed separately.
Measures of central tendency enable us to get an idea of the entire data from a single value
where the data is considered to be concentrated. For example, it is impossible to remember the
sales figures of various retail outlets in a region. But the average could be used to make
conclusions about the sales of the entire region. The average condenses a great amount of data,
into a single representative value, so that data can be summarized easily. Measure of Central
tendency also enables to compare two or more sets of data. For example the average sales figures
of two brands in the same product category can be compared.
2.2.2 Characteristics of a good measure of central tendency
A good measure of central tendency should possess, as far as possible, the following
characteristics:
• Easy to understand
• Easy to compute
• Based on all the observations
• Uniquely defined
• Possibility of further algebraic treatment
• Not unduly affected by extreme values
This method is illustrated with the help of the given data on the age groups of people in an area.

Age group   Frequency (fi)
18-24       17
24-30       22
30-36       26
36-42       35
42-48       33
48-54       30
54-60       32
60-66       21
66-72       15

The class midpoints (Mi) and the products fi × Mi are computed as follows:

Age group   fi    Mi    fi × Mi
18-24       17    21    357
24-30       22    27    594
30-36       26    33    858
36-42       35    39    1365
42-48       33    45    1485
48-54       30    51    1530
54-60       32    57    1824
60-66       21    63    1323
66-72       15    69    1035
Total       ∑fi = 231        ∑fi Mi = 10,371

Mean = ∑fi Mi / ∑fi = 10371 / 231 ≈ 44.9 years
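The same grouped-data mean can be reproduced in a few lines of Python, using the class frequencies and midpoints from the table above:

```python
# Class frequencies (fi) and class midpoints (Mi) from the table above
freqs = [17, 22, 26, 35, 33, 30, 32, 21, 15]
midpoints = [21, 27, 33, 39, 45, 51, 57, 63, 69]

total_f = sum(freqs)                                     # sum of fi
total_fm = sum(f * m for f, m in zip(freqs, midpoints))  # sum of fi * Mi
mean = total_fm / total_f                                # grouped arithmetic mean
```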
In calculating the arithmetic mean, equal importance is given to all the observations. But there are
situations where the relative importance of different values is not the same. In such cases, the
weighted arithmetic mean needs to be used. The procedure is similar to the calculation of the
grouped-data arithmetic mean, where the frequency is used as the weight associated with the class
interval. For the data values x1, x2, x3, ….., xn and associated weights w1, w2, w3, ….., wn, the
weighted arithmetic mean can be computed using the formula:
μw = ∑wi xi / ∑wi = (w1 x1 + w2 x2 + ⋯ + wn xn) / (w1 + w2 + ⋯ + wn)
You are aware of the use of weighted averages when the various components of evaluation are
not equally important. For example, your final grade may be composed of 30 percent of the
mid-term score, 50 percent of the final exam score and 20 percent of the assignment score. The
final grade is then calculated by multiplying each score (xi) by its weight (wi) and summing the
products.
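The grade example can be sketched in Python (the scores below are hypothetical):

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(wi * xi) / sum(wi)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Hypothetical grades: mid-term 30 %, final exam 50 %, assignment 20 %
scores = [80, 90, 70]
weights = [0.30, 0.50, 0.20]
grade = weighted_mean(scores, weights)   # approximately 83
```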
When we are dealing with quantities that change over a period of time, we may need to find the
average rate of change, such as the average growth rate or depreciation rate. In such cases, the
simple arithmetic mean is inappropriate, as it will give a wrong answer. The appropriate measure
of central tendency is the geometric mean.
The geometric mean is defined as the nth root of the product of the ‘n’ values of the data. If
x1, x2, x3, …, xn are the values of the data, then the geometric mean is:

GM = (x1 × x2 × x3 × ⋯ × xn)^(1/n)
When the number of observations is large, a logarithmic transformation can be applied to
simplify the calculations. Taking log on both sides, the formula becomes:

Log(GM) = ΣLog(xi) / n = (1/n)(Log x1 + Log x2 + Log x3 + ⋯ + Log xn)

GM = Antilog{ΣLog(xi) / n}
Geometric mean is useful to find the average percentage increase in sales, production, population
etc. It is the most representative average in the construction of index numbers. When large
weights are to be given to smaller values and small weights to larger values, the most appropriate
average to be used is geometric mean. Let’s take an example to understand computing of
geometric mean.
Inflation rates in percentage for the past six months are given as 5.5, 6.2, 7.2, 6, 6.5 and 5.9. Find
the average inflation rate over the past six months.
First, we find the index for each month by dividing the percentage rate by 100 and adding 1. Then
we take the GM of these indices as the average index, from which the average inflation rate follows.

GM = (1.055 × 1.062 × 1.072 × 1.06 × 1.065 × 1.059)^(1/6) = (1.4359)^(1/6) = 1.062

The average inflation rate over the past six months is therefore about 6.2% (1.062 − 1 = 0.062).
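The same calculation can be sketched in Python (values from the example; the average inflation rate follows from subtracting 1 from the GM):

```python
import math

# Geometric mean of monthly price indices (1 + rate/100).
rates = [5.5, 6.2, 7.2, 6.0, 6.5, 5.9]         # monthly inflation, percent
indices = [1 + r / 100 for r in rates]

gm = math.prod(indices) ** (1 / len(indices))  # ~1.062
avg_inflation = (gm - 1) * 100                 # ~6.2 percent
```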
The harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals of the
individual observations. If x1, x2, x3, …, xn are the values of the data, then the harmonic mean is
given by the formula:
HM = n / (1/x1 + 1/x2 + ⋯ + 1/xn) = n / Σ(1/xi)
The harmonic mean is appropriate when the data values are rates, i.e. ratios of two variables
measured in different units. It is very useful for computing the average speed of a
journey or the average price at which a product is sold. In finance, the harmonic mean is used to
determine the average of financial multiples like the P/E ratio.
A journey from place X to Y is completed using four different cars. The average speeds of the
cars are 50 km/hr, 75 km/hr, 60 km/hr and 80 km/hr. Find the average speed of the journey.
The average speed of the journey is calculated as:

HM = 4 / (1/50 + 1/75 + 1/60 + 1/80) = 64 km/h
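A sketch of the same computation in Python:

```python
# Harmonic mean: n / sum(1/xi) -- appropriate for averaging speeds
# over equal distances.
speeds = [50, 75, 60, 80]  # km/h
hm = len(speeds) / sum(1 / s for s in speeds)
print(hm)  # ~64 km/h
```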
Like the arithmetic mean and the geometric mean, the harmonic mean uses all the values for
computation of the average. However, the harmonic mean cannot be used when one or more
observations have a zero value, or when the observations can take both positive and negative
values. The harmonic mean has very limited applications in business.
3. The following frequency distribution has been constructed from air
transport traffic data. Calculate the arithmetic mean.
No of passengers 20-30 30-40 40-50 50-60 60-70 70-80
travelling
No of airports 8 7 1 0 3 1
4. The management of a restaurant has employed 2 managers, 5 cooks and 10 waiters. The
monthly salaries of the managers, cooks and waiters are 30000, 20000 and 10000
respectively. Find the mean salary paid per month by the management.
2.4 Median
Median is a measure of central tendency different from all the averages we have discussed
so far. Median is the middle value in a set of data that has been arranged in ascending or
descending order. While computing various types of means all the values in the data set are
used, whereas median is a single value from the data set that is the middle most or central
item in the set of numbers. Half of the values lie above the point and the other half lie below
it, thus marking the central item in the data.
2.4.1 Median for Ungrouped data
To find the median of ungrouped data, first arrange the data in ascending or descending
order. If the data set contains an odd number of values, the middle item (median) is one
of the original observations. If there is an even number of values, the median is the
average of the two middle observations. The formula for median is:
Median = ((N + 1)/2)th item in the data array
Suppose we want to find the median of a data set containing seven observations. Then, as
per the above formula, the median is the (7+1)/2 = 4th value in the data set. Let's take an
example of data on the time taken (in minutes) to complete a task daily. First the data has to be
arranged in ascending order:
Ordered data
29 31 35 39 39 40 43 44 44 52
Median = ((N + 1)/2)th item = ((10 + 1)/2)th item = 5.5th item
As the median is at the 5.5th item, we take the average of the 5th and the 6th values,
which are 39 and 40. Therefore the median is (39 + 40)/2 = 39.5. The median of 39.5 means
that for half of the days the time taken to do the task is less than or equal to 39.5 minutes,
and for half of the days the time taken is greater than or equal to 39.5 minutes.
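The median above can be verified with Python's statistics module (using the ten observations from the example):

```python
import statistics

# Time taken (minutes) to complete the task, unordered as collected.
times = [31, 43, 29, 52, 39, 44, 40, 35, 44, 39]

print(sorted(times))             # [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]
print(statistics.median(times))  # 39.5
```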
For the grouped data, we first find the N/2 value. Then from the cumulative frequency we
find the class in which N/2th item falls. Such a class is called median class. Then the
median is calculated using the following formula:
Median = L + ((N/2 − cfp) / fmd) × w
where, L= lower limit of the median class
cfp= cumulative frequency of the class preceding the median class
fmd = frequency of the median class
w = width of the class .
Let's find the median for the following grouped data on unemployment rates:
Class Interval Frequency
1-3 4
3-5 12
5-7 13
7-9 19
9-11 7
11-13 5
To facilitate the process of locating the median class, let’s find the cumulative frequency.
Class Interval Frequency Cumulative frequency
1-3 4 4
3-5 12 16
5-7 13 29
7-9 19 48
9-11 7 55
11-13 5 60
Median = N/2th value = 60/2 = 30th value. Let's understand how to locate the median class
using the cumulative frequency column. It can be seen that the 1st to 4th values lie in class 1-3,
the 5th to 16th in the second class, the 17th to 29th in the third class, the 30th to 48th in the
fourth class, and similarly for the rest of the values. Thus the 30th value lies in the class interval
7-9.
Median = L + ((N/2 − cfp) / fmd) × w = 7 + ((30 − 29) / 19) × 2 = 7 + 0.105 = 7.105
The median value of the unemployment rates is 7.105.
Like the grouped arithmetic mean, the grouped median is an approximate value. It is based on the
assumption that the actual values are spread uniformly across the median class interval, which may
not always be true.
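A small Python sketch of the grouped-median formula, checked against the unemployment-rate example (class boundaries as in the table):

```python
def grouped_median(classes, freqs):
    """Median for grouped data: L + ((N/2 - cfp) / fmd) * w."""
    n = sum(freqs)
    cum = 0  # cumulative frequency of the classes preceding the current one
    for (low, high), f in zip(classes, freqs):
        if cum + f >= n / 2:
            return low + (n / 2 - cum) / f * (high - low)
        cum += f

classes = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11), (11, 13)]
freqs = [4, 12, 13, 19, 7, 5]
print(round(grouped_median(classes, freqs), 3))  # 7.105
```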
2.5 Mode
The mode is the value that is repeated most often in the data set. It is rarely used as a measure of
central tendency for ungrouped data, as sometimes a single unrepresentative value might occur
most often just by chance. For example, in the data series 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 9,
12, 12 and 12, the mode is 12 as it occurs the maximum number of times. But it can be observed
that 12 is not representative of the central part of the data, as most of the values are actually
below 10.
When data is grouped in the form of frequency distribution, it is assumed that the mode is
located in the class with the most items. The class with the highest frequency will be
called the modal class. To determine the mode from the modal class, the following formula
is used:

Mode = L + (d1 / (d1 + d2)) × w

where, L = lower limit of the modal class
d1 = f1 − f0, the difference between the frequencies of the modal class (f1) and the preceding class (f0)
d2 = f1 − f2, the difference between the frequencies of the modal class and the succeeding class (f2)
w = width of the class
The modal class is 15-20, as the highest frequency is 10. Let's substitute the values in the
given formula:
𝑑1 = 𝑓1 − 𝑓0 = 10 − 0 = 10 𝑑2 = 𝑓1 − 𝑓2 = 10 − 9 = 1
Mode = L + (d1 / (d1 + d2)) × w = 15 + (10 / (10 + 1)) × 5 = 19.55
The mode of the age of students enrolled for the programme is 19.55 years.
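The modal formula can be sketched the same way (values from the example above):

```python
def grouped_mode(L, f1, f0, f2, w):
    """Mode = L + d1/(d1 + d2) * w, with d1 = f1 - f0 and d2 = f1 - f2."""
    d1, d2 = f1 - f0, f1 - f2
    return L + d1 / (d1 + d2) * w

# Modal class 15-20: f1 = 10, preceding f0 = 0, succeeding f2 = 9, width 5.
print(round(grouped_mode(15, 10, 0, 9, 5), 2))  # 19.55
```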
Check your progress 3
1. Compute mode
Frequency: 1 6 10 9 3 1
2.6 Quartiles
Quartiles are related positional measures of central tendency. These are useful and quite
frequently used measures. The most familiar positional averages are quartiles, deciles and
percentiles.
Quartiles: Quartiles are values that divide the data into four equal parts. To divide data
into four parts we need three partition values, and these are called Quartile 1, Quartile 2 and
Quartile 3. The first quartile Q1 is such that 25% of the values are smaller and 75% of the
observations are larger than this value. The second quartile Q2 is the median, as 50% of the
values are smaller and 50% of the observations are larger than it. The third quartile Q3
divides the data in such a way that 75% of the values are smaller and 25% of the observations
are larger than Q3.
The ith quartile is located at the (iN/4)th item of the data set. The class in which the quartile lies
is known as the quartile class. The formula for computing quartiles for grouped data, similar to
the median formula, is as follows:

Qi = L + ((iN/4 − cfp) / fq) × w    for i = 1, 2, 3
where, L = lower limit of the quartile class
cfp= cumulative frequency of the class preceding the quartile class
fq = frequency of the quartile class
w = width of the class .
Deciles: Deciles are values that divide the data into ten equal parts. Since we need nine
points to divide a data set into ten parts, there are nine deciles, denoted D1, D2, D3, …, D9.
The ith decile is located at the (iN/10)th item of the data set, where i = 1, 2, 3, …, 9. The class in
which the decile falls is known as the decile class. The formula for computing deciles for
grouped data is:

Di = L + ((iN/10 − cfp) / fd) × w    for i = 1, 2, 3, …, 9
where, the symbols have usual meaning and interpretation
Percentiles: Percentiles are values which divide the data into hundred equal parts.
There are ninety-nine percentiles, denoted P1, P2, P3, …, P99. The ith percentile is located at
the (iN/100)th item of the data set. The formula is:

Pi = L + ((iN/100 − cfp) / fp) × w    for i = 1, 2, 3, …, 99

where, L = lower limit of the percentile class
cfp = cumulative frequency of the class preceding the percentile class
fp = frequency of the percentile class
w = width of the class.
To illustrate the computation of quartiles, deciles and percentiles, consider the following
data on sales of companies (in lakhs):
Sales (lakhs): 0-10 10-20 20-30 30-40 40-50 50-60
Frequency: 12 18 27 20 17 6
Solution:
Class Interval Frequency Cumulative frequency
0-10 12 12
10-20 18 30
20-30 27 57
30-40 20 77
40-50 17 94
50-60 6 100
Q1 = (1 × N/4)th item = (1 × 100/4) = 25th item, which falls in the class 10-20, as the cumulative
frequency of this class is 30. Substituting the relevant values in the formula:

Q1 = L + ((N/4 − cfp) / fq) × w = 10 + ((25 − 12) / 18) × 10 = 17.22
This value of Q1 suggests that 25% of the companies' sales are Rs. 17.22 lakhs or less,
and 75% of the companies' sales figures are more than that.
Q3 = (3 × N/4)th item = (3 × 100/4) = 75th item, which falls in the class 30-40, as the cumulative
frequency of this class is 77. Substituting the relevant values in the formula:

Q3 = L + ((3N/4 − cfp) / fq) × w = 30 + ((75 − 57) / 20) × 10 = 39
This value of Q3 suggests that 75% of the companies' sales are Rs. 39 lakhs or less,
and only 25% of the companies' sales figures are more than that.
D6 = (6 × N/10)th item = (6 × 100/10) = 60th item, which falls in the class 30-40, as the cumulative
frequency of this class is 77. Substituting the relevant values in the formula:

D6 = L + ((6N/10 − cfp) / fd) × w = 30 + ((60 − 57) / 20) × 10 = 31.5
This value of D6 suggests that 60% of the companies' sales are Rs. 31.5 lakhs or less,
and only 40% of the companies' sales figures are more than that.
P80 = (80 × N/100)th item = (80 × 100/100) = 80th item, which falls in the class 40-50, as the
cumulative frequency of the preceding class is only 77 while that of this class is 94. Substituting
the relevant values in the formula:

P80 = L + ((80N/100 − cfp) / fp) × w = 40 + ((80 − 77) / 17) × 10 = 41.76
This value of P80 suggests that 80% of the companies' sales are Rs. 41.76 lakhs or less,
and only 20% of the companies' sales figures are more than that.
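All four calculations follow the same pattern, so they can be captured by one function (a sketch; class boundaries and frequencies from the sales table):

```python
def grouped_fractile(classes, freqs, i, m):
    """i-th fractile of m parts: quartiles m=4, deciles m=10, percentiles m=100."""
    n = sum(freqs)
    target = i * n / m  # position of the required item
    cum = 0
    for (low, high), f in zip(classes, freqs):
        if cum + f >= target:
            return low + (target - cum) / f * (high - low)
        cum += f

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [12, 18, 27, 20, 17, 6]

print(round(grouped_fractile(classes, freqs, 1, 4), 2))     # Q1  = 17.22
print(round(grouped_fractile(classes, freqs, 3, 4), 2))     # Q3  = 39.0
print(round(grouped_fractile(classes, freqs, 6, 10), 2))    # D6  = 31.5
print(round(grouped_fractile(classes, freqs, 80, 100), 2))  # P80 = 41.76
```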
Let's try to summarise the difference between the three major measures of central tendency.
A distribution of data, in which the right half is mirror image of the left half is said to be
symmetrical. One example of symmetrical distribution is normal distribution or bell shaped
curve. In a symmetrical distribution, mean, median and mode all coincide at the same point. If
the distribution is skewed, the mean, median and mode are not equal. In a moderately skewed
distribution, the distance between mean and median is approximately one third of the distance
between the mean and the mode. This can be expressed as
Mean − Median = (1/3)(Mean − Mode)

which can be rearranged as: Mode = 3 Median − 2 Mean
Thus if we know the values of any two measures of central tendency, the third measure can be
approximately determined in any moderately skewed distribution. The curves (a) and (c) are
examples of moderately skewed distributions. A skewed distribution can be of two types:
(1) negatively skewed distribution, (2) positively skewed distribution.
A negatively skewed distribution is skewed to the left with a long left tail, and a positively
skewed distribution is skewed to the right with a long right tail. It can be observed from the
above curves that for a positively skewed distribution Mean > Median > Mode, while for a
negatively skewed distribution Mean < Median < Mode.
However, it can be observed that in any skewed distribution, the median lies between the mean
and the mode. When the population is skewed negatively or positively, the median is often the
best measure, as it always lies between the mean and the mode. The median is neither as highly
influenced by the frequency of occurrence of a single value as the mode, nor is it pulled by
extreme values as the mean.
1. (b)
2.
Class Frequency Cumulative Frequency
0-1 1 1
1-2 4 5
2-3 8 13
3-4 7 20
4-5 3 23
5-6 2 25
N/2=25/2=12.5, which falls in the class 2-3 , as the cumulative frequency of this class is 13.
Substituting the relevant values in the formula
Median = L + ((N/2 − cfp) / fmd) × w = 2 + ((12.5 − 5) / 8) × 1 = 2.9375
1. The modal class is 2-3, as the highest frequency is 10. Let’s substitute the values in the given
formula
d1 = f1 − f0 = 10 − 6 = 4;  d2 = f1 − f2 = 10 − 9 = 1

Mode = L + (d1 / (d1 + d2)) × w = 2 + (4 / (4 + 1)) × 1 = 2.8
Answers to check your Progress 4
1. Percentile
2. True
3. False
4. P30 = 20 means 30% of the values are less than or equal to 20 and 70% are more than 20.
1. (d)
2. (c)
2.10 Glossary
Arithmetic Mean: A measure of central tendency, computed by summing all the values and
dividing by the number of observations
Geometric Mean: A measure of central tendency used to measure the average rate of change or
growth for some quantity, computed by taking the nth root of the product of the values
representing change
Harmonic Mean: A measure of central tendency defined as the reciprocal of the arithmetic mean
of the reciprocals of the individual observations.
Median: The middle point of the data set that divides the data into two halves.
Mode: The value most often repeated in the data set.
Quartile: Fractiles that divide the data into four equal parts.
Decile: Fractiles that divide the data into ten equal parts.
Percentile: Fractiles that divide the data into hundred equal parts.
2.11 Assignment
Frequency: 31 57 26 14 6 3
2.12 Activities
Compare some small cap, mid cap and large cap mutual funds on 3-year and 5-year returns on
the basis of measures of central tendency.
3.1 Introduction
3.8 Glossary
3.9 Assignment
3.10 Activities
3.11 Case Study
• Identify the various situations where discrete probability distributions can be applied.
3.1 Introduction
Many times organizations are more interested in some function of the outcome of a process/
experiment than the actual outcome itself. For example road safety service may be interested to
know the probability of a particular number of accidents that could take place in a day rather
than the details of the accident itself. We recognize that this information on probability will be
very useful in taking decisions. Let's say a manufacturer randomly selects two boxes from a
large batch of boxes to test their quality. Each selected box can be rated as good or defective. If
the boxes are numbered 1 and 2, a defective box is designated as D and a good box is designated
as G. Then all the possible outcomes in the sample space are {D1G2, D1D2, G1G2, G1D2}. The
expression D1G2 means the first box is defective and the second is of good quality. The possible
outcomes are getting zero, one or two good boxes. It can be observed that the probability of
getting exactly one good box (2/4) is more than that of getting both good (1/4). This
representation of possible outcomes and their
probabilities is known as probability distribution. Development of probability theory helps in
specifying probability distributions. There are a number of theoretical probability distributions
that have been analyzed. Many real life situations could be approximated to these distributions
and used for decision making. We will be studying some common probability distributions in
this and the subsequent unit. The objective of this unit is to study one type of probability
distribution- i.e. discrete probability distribution. The basic concept and its application in
decision making will be discussed
For example, the following data gives the distribution of the number of loans approved per week
at the local branch office of a bank. The listing is collectively exhaustive as all the possible
outcomes are listed, and thus the probabilities must add up to 1.
X 0 1 2 3 4 5 6
P(x) 0.1 0.1 0.2 0.3 0.15 0.1 0.05
The given figure is a graphical representation of the data, with the values of the random variable
x shown on the horizontal axis. The probability that x takes on these values is shown on vertical
axis
[Figure: bar chart of P(x) versus loans per week, x = 0 to 6]
3.3.1 Expected Value
After constructing the probability distribution for a random variable, we often want to calculate
the mean of the random variable. The mean µ of a probability distribution is the expected value
of a random variable. To calculate the expected value, you multiply each possible outcome x by
its corresponding probability P(x) and then add the resulting terms. The mathematical formula
for computing the expected value of a discrete random variable is
𝜇 = 𝐸(𝑥) = ∑ 𝑥𝑖 𝑃(𝑥𝑖 )
Let’s find the expected value for the given probability distribution on the loan approved per
week using the formula.
μ = E(x) = Σ xi P(xi)
= 0(0.1) + 1(0.1) + 2(0.2) + 3(0.3) + 4(0.15) + 5(0.1) + 6(0.05) = 2.8
The expected value of 2.8 represents the mean number of loans approved per week. For
experiments that can be repeated numerous times, the expected value can be interpreted as the
‘long run’ average value of the random variable. However it does not mean that the random
variable will assume this value, whenever next the experiment is conducted. In fact, it is
impossible to approve exactly 2.8 loans in any week. This value is important to a manager from
both the planning and decision making points of view. For example, suppose the bank is
interested to know how many loans will be approved in the next five weeks. Although we cannot
specify the exact number of loans approved in a week, based on the expected value of 2.8 loans
per week, we can say that the average number of loans approved in the next five weeks will be
14 (2.8 × 5). In
making information.
σ = √(Σ (xi − μ)² P(xi)) = √2.46 = 1.57
The variance of the number of loans approved per week is 2.46. For the purpose of easier
managerial interpretation, the standard deviation may be preferred over the variance, as it is
measured in the same units as the random variable. The variance (σ2) is measured in squared
units and is thus more difficult for a manager to interpret. The main utility of the variance and
standard deviation lies in comparing the variability of different random variables. For example,
the numbers of loans approved by two credit risk managers can be compared for variability.
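The loan-approval figures above can be checked directly (a sketch using the distribution from the table):

```python
# Discrete distribution of loans approved per week.
x = [0, 1, 2, 3, 4, 5, 6]
p = [0.1, 0.1, 0.2, 0.3, 0.15, 0.1, 0.05]

mu = sum(xi * pi for xi, pi in zip(x, p))               # expected value, 2.8
var = sum((xi - mu) ** 2 * pi for xi, pi in zip(x, p))  # variance, 2.46
sd = var ** 0.5                                         # standard deviation, ~1.57
print(mu, var, round(sd, 2))
```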
There are many discrete probability distributions, but in this unit, we will be discussing two
types of discrete distribution- Binomial distribution and Poisson distribution.
Check Your Progress 2
1. The mean or the expected value of a discrete distribution is the long-run average of
the occurrences.( True/False)
2. To compute the variance of a discrete distribution, it is not necessary to know the
mean of the distribution.(True/ False)
3. You are offered an investment opportunity. Its outcomes and probabilities are
presented in the following table.
X P(x)
-$1,000 .40
$0 .20
+$1,000 .40
The mean of this distribution is _____________.
a) -$400
b) $0
c) $200
d) $400
4. You are offered an investment opportunity. Its outcomes and probabilities are
presented in the following table.
X P(x)
-$1,000 .40
$0 .20
+$1,000 .40
The standard deviation of this distribution is _____________.
a) -$400
b) $663
c) $800,000
d) $894
3.4 Binomial Distribution
As the word binomial suggests, any single trial of a binomial experiment has only two
possible outcomes. The two outcomes are labeled success and failure. The outcome of interest to
the researcher is usually labeled as success. The symbol ‘p’ represents the probability of success
of a trial and the symbol ‘q’ is the probability of failure of a trial. Let ‘x’ denote the value of the
random variable, then x can have a value of 0, 1, 2, 3…..n, depending on the number of success
observed in n trials. The mathematical formula for computing the probability of any value for the
random variable, where binomial distribution is applicable is:
P(x) = nCx p^x q^(n−x) = [n! / (x!(n − x)!)] × p^x q^(n−x)
where n = number of trials
x = number of successes desired
p = probability of getting a success in one trial
q = 1 − p = probability of failure in one trial
To illustrate the binomial probability distribution, let us consider the experiment of customers
entering a toy store. To keep the problem relatively small, we restrict the experiment to the next
five customers. Based on experience, the store owner estimates that the probability of a customer
making a purchase is 0.30. What is the probability that exactly three of the next five customers
make a purchase?
Let’s check the assumptions of binomial experiment:
1. The experiment is described as sequence of five identical trials, one trial each for the
five customers entering the store
2. Each trial has only two possible outcomes: the customer makes a purchase (success) or
does not make a purchase (failure)
3. The purchase decision of each customer is independent of the decisions of the other
customers, i.e. the trials are independent
4. The probabilities of purchase p = 0.30 and no purchase q = 0.70 remain constant
throughout the experiment
The random variable ‘x’ is defined as number of customers making a purchase. With n=5 trials,
p=0.30 , q=0.70, the probability that exactly 3 customers out of five make a purchase can be
computed using the formula:
P(x = 3) = nCx p^x q^(n−x) = [5! / (3!(5 − 3)!)] × 0.30³ × 0.70² = 10 × 0.027 × 0.49 = 0.1323
Similarly, we can find the probability of zero (x = 0) customers making a purchase:

P(x = 0) = 5C0 × 0.30⁰ × 0.70⁵ = [5! / (0!(5 − 0)!)] × 1 × 0.16807 = 0.1681
If we are interested in computing the probability of at most 3 customers making a purchase, we
need to find the probabilities P(x=0), P(x=1), P(x=2) and P(x=3) and then sum them. In the
next section, we will discuss the use of tables to obtain the probability values directly.
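Instead of tables, the binomial probabilities can also be computed directly; a sketch for the toy-store example:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Toy-store example: n = 5 customers, p = 0.30.
print(round(binom_pmf(3, 5, 0.30), 4))  # 0.1323
print(round(binom_pmf(0, 5, 0.30), 4))  # 0.1681

# P(at most 3 purchases) = P(0) + P(1) + P(2) + P(3)
print(round(sum(binom_pmf(k, 5, 0.30) for k in range(4)), 4))  # 0.9692
```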
Binomial distributions are a family of distributions. Every different value of n and/or every
different value of p gives a different binomial distribution and tables are available for various
combinations of n and p values. Such a table for binomial probability values is provided in
Appendix Statistical Table A. In order to use this table, we need to specify values of n, p and x
for the binomial experiment. Each table is headed by a value of n. Eleven values of p are
presented in each table of size n. The column below each value of p is the binomial distribution
for that combination of n and p.
To illustrate the use of Binomial tables, let’s take an example. ABC resources, publishes data on
market share for various product categories in FMCG. As per the latest report, Oreo controls 10
% of the market of cookies brand. Suppose 20 purchasers are selected randomly from the
population. What is the probability that fewer than four purchasers choose Oreo?
For this problem n = 20, p = 0.10 and x < 4. The portion of the binomial tables under n = 20 can
be used to find the probability values. Search along the p values for 0.10. Determining the
probability of getting x < 4 involves adding the probabilities for x = 0, 1, 2 and 3. The values
appear at the intersection of each x value and the p = 0.10 column.
x value Probability
0 0.122
1 0.270
2 0.285
3 0.190
∑= 0.867
P(x < 4) = 0.867. If 10% of all cookie purchasers prefer Oreos and 20 cookie purchasers are
randomly selected, then about 86.7% of the time fewer than four of the 20 will select Oreos.
Let’s say, according to a study, 64% of all consumers believe that public sector banks are more
competitive than five years ago. If 25 consumers are selected randomly, what is the expected
number who believe that public sector banks are more competitive than they were five years
ago?
This problem can be described by the binomial distribution with n = 25 and p = 0.64. The mean
can be computed as:
𝜇 = 𝑛. 𝑝 = 25 × 0.64 = 16
It means that, in the long run, if 25 consumers are selected randomly again and again, and if 64%
of consumers believe the given statement, then on average 16 out of 25 will believe that
public sector banks are more competitive than five years ago.
The standard deviation of the binomial distribution is denoted as ‘σ’ and is computed using the
following formula:
𝜎 = √𝑛. 𝑝. 𝑞
For the given data, the standard deviation is
𝜎 = √25 × 0.64 × 0.36 = 2.4
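Both figures can be verified in one line each (a sketch using the survey numbers above):

```python
# Binomial mean and standard deviation: mu = n*p, sigma = sqrt(n*p*q).
n, p = 25, 0.64
mean = n * p                    # 16.0
sd = (n * p * (1 - p)) ** 0.5   # ~2.4
print(mean, round(sd, 2))
```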
Check Your Progress 3
1. The distribution that deals only in success and failures is referred to as the ________
2. If x is a binomial random variable with n=8 and p=0.6, the mean value of x is _____
a) 6
b) 4.8
c) 3.2
d) 8
3. If x is a binomial random variable with n=8 and p=0.6, the standard deviation of x is
a) 4.8
b) 3.2
c) 1.92
d) 1.39
4. If x is a binomial random variable with n=8 and p=0.6, what is the probability that x
is equal to 4?
a) 0.500
b) 0.005
c) 0.124
d) 0.232
3.5 Poisson Distribution
The binomial distribution describes a distribution with two possible outcomes from a given
number of trials. The Poisson distribution focuses on the number of discrete occurrences
over some interval. A Poisson experiment does not have a fixed number of trials (n) as the
binomial distribution does. The Poisson distribution has the following characteristics:
• It describes discrete occurrences over a continuum
• Each occurrence is independent of the other occurrences
• The occurrence in each interval can range from zero to infinity
• The expected number of occurrences must remain constant throughout the experiment
This distribution was used initially to describe the occurrence of rare events over some interval.
Some common examples that a Poisson random variable can describe are the number of
accidents per day, the number of earthquakes occurring over a time period, the number of
misprints on a page, the number of interruptions per minute on a server, the number of arrivals
at a tollbooth, etc.
If a Poisson-distributed phenomenon is studied over a long period of time, a long-run average
can be determined. This average is denoted lambda (λ) and is used to describe the Poisson
distribution. The Poisson formula used to compute the probability of occurrences over an
interval for a given lambda value is:

P(x) = (λ^x e^(−λ)) / x!

where, x = 0, 1, 2, 3, …
λ = long-run average
e = 2.71828

Here x is the number of occurrences per interval for which the probability is to be computed.
The λ value must remain constant throughout the Poisson experiment.
Suppose that we are interested in the number of arrivals at a bank window during a 10 minute
period on weekday mornings. We assume that the arrival of one customer is independent of
arrival of the other. Based on the historical data it is found that the average number of
customers arriving during a 10 minute interval of time is 8. If we want to find the probability of
arrival of five customers in 10 minutes, we would use x=5, ʎ= 8 per 10 minutes and compute:
P(x = 5) = (λ^x e^(−λ)) / x! = (8⁵ × 2.71828⁻⁸) / 5! = 0.0916
Suppose we want to find the probability of 9 customers arriving in twenty minutes. We need to
note that there is a change in the interval: instead of 10 minutes, the probability is to be found
for 20 minutes. As per the λ value, on average 8 customers arrive in 10 minutes. We can derive
the new average rate for 20 minutes by multiplying λ by 2, i.e. 16 customers per 20 minutes. To
compute, we would use x = 9 and λ = 16 per 20 minutes:

P(x = 9) = (λ^x e^(−λ)) / x! = (16⁹ × 2.71828⁻¹⁶) / 9! = 0.0213
The probability of 9 customers arriving in a twenty-minute duration is 0.0213. Similarly, if we
want to find the probability of an x value for 5 minutes, the lambda value will be 4 customers
per five minutes. If we want to find cumulative probabilities, like fewer than 8 customers, we
need to find the individual probabilities (for x = 0, 1, 2, 3, 4, 5, 6, 7) and then add them up.
However, in this case it will be easier to use Poisson tables.
Every value of lambda determines a different Poisson distribution. Regardless of the nature of
the interval associated with the lambda, the Poisson distribution for a particular lambda is the
same. Table B contains the Poisson distribution for selected values of lambda. Probabilities for
each x value associated with a given lambda are displayed if the x value has a nonzero
probability in the table.
Let’s illustrate the use of Poisson table for the given problem. The number of faults per month
that arise in the gearboxes of travel buses is known to follow a Poisson distribution with a mean
of 2.5 faults per month. What is the probability that in a given month less than 3 faults are
found?
For this problem λ = 2.5 faults per month and x = 0, 1, 2. The portion of the Poisson tables
under λ = 2.5 can be used to find the probability values. The values appear at the intersection
of each x value and the λ = 2.5 column.
x value Probability
0 0.0821
1 0.2052
2 0.2565
∑= 0.5438
The probability that in a given month less than three faults are found is 0.5438.
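The Poisson probabilities used in this section can also be computed directly rather than read from tables; a sketch covering the three worked examples:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x) = lam^x * e^(-lam) / x!"""
    return lam ** x * exp(-lam) / factorial(x)

print(round(poisson_pmf(5, 8), 4))   # bank window, 5 arrivals: 0.0916
print(round(poisson_pmf(9, 16), 4))  # 9 arrivals in 20 minutes: 0.0213

# Gearbox faults: P(fewer than 3) with lam = 2.5 per month.
print(round(sum(poisson_pmf(k, 2.5) for k in range(3)), 4))  # 0.5438
```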
3.5.2 Mean and Standard deviation of Poisson Distribution
The mean or expected value of a Poisson distribution is λ. It is the long-run average of
occurrences over an interval if many samples are taken over time. Lambda is usually not a whole
number, so most of the time it is impossible to actually observe lambda occurrences in an
interval. For example, suppose lambda is 4.5 per interval for a Poisson distribution, and a
random sample of 20 intervals resulted in the following x occurrences:
3, 4, 7, 6, 5, 4, 3, 4, 5, 6, 4, 5, 3, 4, 5, 6, 7, 5, 3, 4
Computing the average of these data gives 4.65; however, over infinite sampling the long-run
average λ is 4.5 per interval. We can note that when λ is 4.5, most of the values lie between 4
and 5, and values like 1, 2, 10, 11, … occur rarely. Thus understanding the mean of the Poisson
distribution gives a feel for the actual occurrences that are likely to happen.
3.6 Let Us Sum Up
Probability experiments produce random occurrences. A variable that contains the outcomes of a
random experiment is called a random variable. A random variable that may assume only a
finite or countably infinite number of possible values is a discrete random variable. Random
variables that may assume any value over a given interval are called continuous random
variables. Discrete distributions are constructed from discrete random variables; continuous
distributions are constructed from continuous random variables. We have looked into situations
which give rise to discrete distributions and how they can be helpful in decision making. We
have discussed two types of discrete distributions: the binomial and the Poisson distribution.
The concepts of expected value and standard deviation were discussed along with their
interpretation. The binomial distribution fits experiments where only two outcomes are possible.
The Poisson distribution pertains to occurrences over some interval. The assumptions are that
each occurrence is independent of other occurrences and that the value of lambda remains
constant over the period of time.
3.7 Answers for Check Your Progress
1. Binomial distribution
2. (b)
3. (d)
4. (d)
1. ʎ
2. 𝜎 = √𝜆
3. (c)
4. P(x ≥ 3), λ = 7 cars per two hours, so for a one-hour interval λ = 3.5 cars.
P(x ≥ 3) = 1 − {P(x=0) + P(x=1) + P(x=2)}
x value Probability
0 0.0302
1 0.1057
2 0.1850
∑= 0.3209
P(x ≥ 3) = 1 − {0.0302 + 0.1057 + 0.1850} = 1 − 0.3209 = 0.6791
3.8 Glossary
Probability distribution: A list of the outcomes of an experiment with the probabilities
associated with those outcomes.
Random Variable: A variable that takes on different values as a result of outcomes of a
random experiment.
Discrete random variable: A random variable that is allowed to take countable infinite or
finite number of values.
Continuous random variable: A probability distribution in which the variable is allowed to
take on any value within a given range.
Discrete probability distribution: A probability distribution of discrete random variable is
called discrete probability distribution.
Continuous probability distribution: A probability distribution of continuous random variable
is known as continuous probability distribution
Expected value: A weighted average of the outcomes of an experiment.
Binomial distribution: The probability distribution for a discrete probability distribution, used
to compute the probability of x success in n trials
Poisson distribution: The probability distribution for a discrete probability distribution, used to
compute the probability of xoccurrences over a specified interval.
3.9 Assignment
1. What is the meaning of the expected value of a probability distribution?
2. What are the assumptions of a Binomial distribution?
3. What are the characteristics of a Poisson distribution?
4. A survey conducted for an insurance company revealed that 70% of workers say job stress
caused frequent health problems. Suppose a random sample of 10 workers is selected. What is
the probability that more than seven of them say job stress caused frequent health problems?
What is the expected number of workers who say job stress caused frequent health
problems?
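Question 4 above is a binomial setting with n = 10 and p = 0.70. As a minimal sketch (our own illustration, not from the text), the two requested quantities can be computed as follows:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.70
# P(X > 7) = P(8) + P(9) + P(10)
p_more_than_7 = sum(binomial_pmf(x, n, p) for x in range(8, n + 1))
expected = n * p  # expected number of workers = n.p
print(round(p_more_than_7, 4), expected)
```

The expected value follows directly from µ = n·p = 10 × 0.70 = 7 workers.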
5. A survey conducted by the Consumer Research Centre reported, among other things, that
women spend an average of 1.2 hours per week shopping online. Assume that hours per
week shopping online are Poisson distributed. If the survey result is true for all women and
if a woman is randomly selected, what is the probability that she did not shop at all online
over a one-week period? What is the probability that a woman would shop three or more
hours online during a one-week period?
3.10 Activities
Develop graphs for the binomial distribution using the tables for n = 8 and (a) p = 0.20, (b) p = 0.50
and (c) p = 0.80, and comment on the shape of the three graphs.
Starting a business entails understanding and dealing with many issues—legal, financing, sales
and marketing, intellectual property protection, liability protection, human resources, and
more. The interest in entrepreneurship is at an all-time high. And there have been spectacular
success stories of early stage startups growing to be multi-billion-dollar companies, such as
Uber, Facebook, WhatsApp, Airbnb, and many others. Starting a business is a huge commitment.
Entrepreneurs often fail to appreciate the significant amount of time, resources, and energy needed to
start and grow a business.
A survey was done to identify the most important advice for starting a business venture. A
random sample of 12 small business owners was contacted and data was collected. As per the
survey, 20% of all small business owners say the most important advice for starting a business is
to prepare for long hours and hard work. Twenty-five percent say the most important advice is to
have good financing ready. Nineteen percent say having a good plan is the most important
advice, 18% say studying the industry and industry knowledge is the most important advice, and
18% list other advice.
Questions
1. What is the probability that six or more owners would say preparing for long hours and
hard work is the most important advice?
2. What is the probability that exactly five owners would say having good financing ready is
the most important advice?
3. What is the expected number of owners who would say having a good plan is the most
important advice?
4.1 Introduction
In the last unit, we discussed situations involving discrete random variable and the resulting
discrete probability distributions. In this unit we will be focusing on random variable which can
take any value over a range. Suppose you are a website designer for a matrimonial site and you
have to make sure that the webpage downloads quickly. The download time is affected by design
of the website and the load on the company’s web server. The random variable ‘download time’
is a continuous variable, as it can take any value on a range and not just whole number. This type
of random variable which can take infinite number of values over a range is called a continuous
random variable and the probability distribution of such variable is called continuous probability
distribution. The concepts and assumptions for this type of distributions is quite different from
those of discrete probability distributions. The objective of this unit is to study the concepts and
usefulness of continuous distribution. We will be discussing some important continuous
probability distributions and their applications in this unit.
Figure 1 graphically represents three continuous distributions. Figure 1(a) depicts a uniform
distribution, where a value is equally likely to occur anywhere in
the range between the smallest value ‘a’ and the largest value ‘b’. Sometimes referred to as the
rectangular distribution, the uniform distribution is symmetric, meaning its mean equals its median.
Figure 1 (a) Uniform Distribution (b) Normal Distribution (c) Exponential Distribution
Figure 1(b) depicts a normal distribution. The normal distribution is symmetrical and bell
shaped, so most of the values group around the mean. The mean, median and mode all have the
same value. An exponential distribution is illustrated in Figure 1(c). An exponential distribution
is a positively skewed distribution, which makes the mean larger than the median. The range of
an exponential distribution is zero to positive infinity, but its shape makes it highly unlikely for
extremely large values to occur.
4.3 Uniform Distribution
A uniform distribution is a probability distribution for which all of the values that a random
variable can take on occur with equal probability in the range between the smallest value ‘a’ and
the largest value ‘b’. Suppose the travel time of buses travelling from city X to city Y is denoted
by x. Assume that the minimum time is 3 hours and the maximum time is 3 hours 20 minutes.
Thus, in terms of minutes, the travel time can take any value between 180 and 200 minutes. As
the random variable x can take any value between 180 and 200 minutes, x is a continuous
variable. Based on past data, the probability of a travel time between 180 and 181 minutes is the
same as the probability of a travel time in any other 1-minute interval up to and including
200 minutes. With every interval being equally likely, the random variable x has a uniform
distribution. The following probability density function defines a uniform distribution:
f(x) = 1/(b − a)   for a ≤ x ≤ b
f(x) = 0           for all other values
In a uniform distribution, the total area under the curve is 1 and as the shape is rectangular the
area can be computed as the product of length and width of the rectangle. Because, by definition,
the distribution lies between the x values of a and b, the length of the rectangle is (b-a).
Combining this with the fact that area under the curve is equal to 1, height of the rectangle can be
solved as follows:
Area of rectangle = length × height = 1, but length = (b − a)
Therefore (b − a) × height = 1
Height = 1/(b − a)
The mean and the standard deviation of the uniform distribution are given as follows:
µ = (a + b)/2
σ = (b − a)/√12
As an example, suppose a production line manufactures a machine part in lots of 10 per minute
during a shift. When the lots are weighed, variation in weights was observed in the range of 34 to
48 grams in a uniform distribution. The height of the distribution is:
Height = 1/(b − a) = 1/(48 − 34) = 1/14
The mean and the standard deviation of this uniform distribution are:
µ = (a + b)/2 = (48 + 34)/2 = 82/2 = 41
σ = (b − a)/√12 = (48 − 34)/√12 = 14/3.464 = 4.041
The probability that x lies in an interval between x1 and x2 is given by:
P(x1 ≤ x ≤ x2) = (x2 − x1)/(b − a), where a ≤ x1 ≤ x2 ≤ b
Suppose for the same problem given above, we are interested in finding the probability that the lot
weighs between 40 and 45 grams. The probability can be calculated as:
P(40 ≤ x ≤ 45) = (x2 − x1)/(b − a) = (45 − 40)/(48 − 34) = 0.3571
So the probability that the lot weighs between 40 and 45 grams is 0.3571. The probability that
the lot weighs less than 34 grams is zero, as the lowest value is 34. Similarly, the probability that
the lot weighs more than 50 grams is also zero, as the upper value is 48.
Let us find the probability that the lot weighs less than 40 grams. As the lowest value is 34, finding
the probability that the lot weighs less than 40 actually means finding the probability of values
between 34 and 40 grams. So the probability is calculated as follows:
P(34 ≤ x ≤ 40) = (x2 − x1)/(b − a) = (40 − 34)/(48 − 34) = 0.4286
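The uniform-distribution calculations above follow directly from the formulas. The sketch below (a Python illustration with names of our own choosing) reproduces them for a = 34, b = 48:

```python
from math import sqrt

a, b = 34.0, 48.0  # lot weights in grams

height = 1 / (b - a)        # height of the rectangle = 1/14
mu = (a + b) / 2            # mean
sigma = (b - a) / sqrt(12)  # standard deviation

def uniform_prob(x1, x2, a, b):
    """P(x1 <= X <= x2) for a uniform distribution on [a, b]."""
    x1, x2 = max(x1, a), min(x2, b)  # clip to the support, so P = 0 outside [a, b]
    return max(x2 - x1, 0) / (b - a)

print(round(mu, 3), round(sigma, 3))
print(round(uniform_prob(40, 45, a, b), 4))  # weight between 40 and 45 g
print(round(uniform_prob(34, 40, a, b), 4))  # weight below 40 g
```

Clipping to [a, b] also captures the point made in the text: any interval entirely below 34 or above 48 has probability zero.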
4.4 Normal Distribution
A very important continuous probability distribution is the normal distribution. There are many
reasons for the normal distribution’s versatility and prominent place in statistics. First, it has
properties that make it applicable to many situations in which it is necessary to make inferences
by taking samples. Quite often, we face the problem of limited data for making inferences about
processes. Irrespective of the shape of the distribution of the population, it has been found that the
normal distribution can be used to characterize sampling distributions. This helps considerably in
inferential statistics. Second, the normal distribution is similar to the actual frequency distribution of
many phenomena, like human characteristics (weight, height, IQ), outputs from physical
processes (dimensions and yield) and other measures of interest to managers. This knowledge
helps us to calculate probabilities of different events in varied situations, which in turn helps
us in decision making. Finally, the normal distribution can be used to approximate certain
probability distributions, which helps considerably in simplifying probability calculations.
The normal distribution is described by two parameters: the mean µ and the standard deviation σ.
The density function of the normal distribution is:
f(x) = (1/(σ√(2π))) e^(−(1/2)((x − µ)/σ)²)
where µ = mean of x
σ = standard deviation of x
π = 3.14159
e = 2.71828
Using calculus to determine areas under the normal curve from this function is difficult and time
consuming; therefore, researchers use table values to analyse normal distribution problems.
Every unique pair of µ and σ values defines a different normal distribution. This characteristic of
being a family of curves could make analysis tedious, because volumes of normal curve
tables – one for each combination of µ and σ – would be required. A mechanism was developed by
which all normal distributions can be converted into a single distribution (the z distribution). This
process yields the standardized normal distribution. The conversion formula for any value x of a
given normal distribution is as follows:
z = (x − µ)/σ, where σ ≠ 0
A z score is the number of standard deviations that a value, x, is above or below the mean. If the
value of x is less than the mean, the z score is negative; if the value of x is more than the mean,
the z score is positive; and if the value of x is equal to the mean, the z score is zero. This formula
converts the distance from the mean into standard deviation units. A standard z distribution table can
be used to find probabilities for any normal curve value that is converted to a z score.
The z distribution is a normal distribution with a mean of 0 and a standard deviation of 1. Any
value of x at the mean is zero standard deviations from the mean. Any value of x that is one
standard deviation above or below the mean has a z value of 1. As per the empirical rule, in a
normal distribution, regardless of the values of µ and σ, 68% of all values are within one
standard deviation of the mean; 95% of all values are within two standard deviations of the mean;
and 99.7% of all values are within three standard deviations of the mean. The z distribution
probability values are given in Appendix Statistical Table C. Table C gives the total area
between 0 and any point on the positive z axis. Since the curve is symmetric, the area between z
and 0 is the same irrespective of whether z is positive or negative. The table areas or
probabilities are always positive.
To use the z table to find probabilities, first note that values of z appear in the left-hand column,
with the second decimal value of z appearing in the top row. For example, for a value of 1.00, we
find the 1.0 in the left-hand column and 0.00 in the top row. Then, by looking into the body of the
table, we find that 0.3413 corresponds to the 1.00 value of z. The value of 0.3413 is the area under
the curve between the mean (z = 0) and z = 1.00, as shown graphically in Figure 2.
Figure 2: Area (probability) of 0.3413 between z = 0 and z = +1
Suppose we want to find the probability of obtaining a z value between z = −1.00 and z = 1.00. We
already know that the probability of a z value between z = 0.00 and z = 1.00 is 0.3413. The
normal distribution is symmetrical, i.e. the shape of the curve on the left of the mean is a mirror
image of the shape of the curve on the right of the mean. Thus the probability of a z value
between z = 0.00 and z = −1.00 is the same as the probability of a z value between z = 0.00 and
z = 1.00, i.e. 0.3413. Hence the probability between z = −1.00 and z = 1.00 is 0.3413 + 0.3413 = 0.6826,
as shown graphically in Figure 3.
Figure 3: Area (probability) of 0.6826 between z = −1 and z = +1
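Instead of a printed z table, the same areas can be obtained from the standard normal cumulative distribution function. Python's standard library provides `statistics.NormalDist`; the sketch below reproduces the two lookups above:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution: mean 0, standard deviation 1

# Area between the mean (z = 0) and z = 1.00 -- the table value 0.3413
area_0_to_1 = z.cdf(1.00) - 0.5

# Area between z = -1.00 and z = +1.00 -- the table value 0.6826
area_minus1_to_1 = z.cdf(1.00) - z.cdf(-1.00)

print(round(area_0_to_1, 4), round(area_minus1_to_1, 4))
```

The exact two-sided area is 0.6827 to four decimals; the text's 0.6826 comes from doubling a table value that was itself rounded to four decimals.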
4.4.3 Solving Normal distribution problems
Suppose that the Ceat tyre company has just developed a new radial tyre that will be sold through a
national chain of stores. Because the tyre is a new product, the management believes that the
mileage guarantee offered with the tyre will be an important factor in consumer acceptance
of the product. Before finalizing the tyre’s mileage guarantee policy, Ceat management wants
some probability information concerning the number of miles the tyres will last.
From actual road tests with the tyres, the engineering department estimates the mean tyre mileage
to be 36500 miles and the standard deviation to be 5000 miles. In addition, the data collected
indicate that a normal distribution is a reasonable assumption. What percentage of the tyres can
be expected to last more than 40000 miles?
z = (x − µ)/σ = (40000 − 36500)/5000 = 0.70
Figure: Probability that x exceeds 40000 (µ = 36500, σ = 5000)
Thus the probability that the normal distribution for tyre mileage will have x values greater than
40000 is the same as the probability that the z distribution will have a z value greater than 0.70.
Using the z table, we find that the area corresponding to z = 0.70 is 0.2580. But we need to
remember that the table provides the area between the mean and the z value. Thus we know that
there is a 0.2580 area between the mean and z = 0.70. The total area under the curve is 1; the
curve being symmetrical, the area from the mean to the tail is 0.5. Thus the area above z = 0.70 will be
0.5 − 0.2580 = 0.2420. In terms of tyre mileage x, we can conclude that there is a 0.2420 probability
that the x value will be above 40000. Thus about 24.2% of the tyres manufactured can be expected to
last more than 40000 miles.
Let us now assume that the company is considering providing a discount on a new set of
tyres if the mileage on the original tyres does not exceed the mileage stated in the guarantee.
What should the guarantee mileage be, if Ceat wants no more than 8% of the tyres to be
eligible for the discount?
Figure: 8% of the tyres fall below the unknown guarantee mileage x (µ = 36500, σ = 5000)
Note that 8% of the area is below the unknown guarantee mileage that we need to calculate. It
means the area between the mean and the unknown guarantee value is 0.5 − 0.08 = 0.42. The
question is: how many standard deviations (the z value) do we have to be below the mean to get 42%
of the area? We have earlier used the z table to find the area using a z value. Now we have the area
between the mean and the z value, and need to find the corresponding z value. If we look for 0.42
in the body of the z table, we see that a 0.4200 area occurs at approximately z = 1.41. As the
area is below the mean, the z value of interest must be −1.41. Hence the desired guarantee mileage
should be 1.41 standard deviations less than the mean. Putting the known values in the formula:
z = (x − µ)/σ
−1.41 = (x − 36500)/5000
x = 36500 − 1.41(5000) = 29450
Therefore a guarantee of 29450 miles will meet the requirement that approximately 8% of the
tyres will be eligible for the discount. With this information the firm might confidently decide
to set its tyre mileage guarantee at 29000 miles. Again we see the important role of probability
distributions in providing information for decision making.
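Both parts of the tyre example can be checked numerically. The sketch below (our own illustration) uses `statistics.NormalDist` with the values from the text. Note that `inv_cdf` works with an exact quantile (z ≈ −1.405) rather than the rounded table value −1.41, so its guarantee figure differs slightly from the table-based 29450:

```python
from statistics import NormalDist

mileage = NormalDist(mu=36500, sigma=5000)

# Part 1: probability that a tyre lasts more than 40000 miles
p_over_40000 = 1 - mileage.cdf(40000)

# Part 2: guarantee mileage such that only 8% of tyres fall below it
guarantee = mileage.inv_cdf(0.08)

print(round(p_over_40000, 4))  # about 0.2420
print(round(guarantee))        # close to the table-based answer of 29450
```

The small discrepancy in part 2 is purely a rounding effect of the printed z table, not a disagreement about the method.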
4.4.4 Normal as an Approximation of the Binomial
As the sample size becomes large, the binomial distribution approaches the normal distribution,
regardless of the value of p. This convergence occurs faster (for smaller values of n) when p is
near 0.50. To work a binomial problem by the normal curve requires a transformation process.
The first part is to convert the two parameters of the binomial distribution, n and p, to the two
parameters of the normal distribution, µ and σ, using the following formulas:
µ = n·p and σ = √(n·p·q)
Suppose we want to find the probability that the random variable x lies between 20 and 24,
when a sample of 60 is taken and the probability of success is 0.30. From the
previous unit we know that this can be calculated using the formula:
P(x) = nCx p^x q^(n−x) = [n!/(x!(n−x)!)] p^x q^(n−x)
We need to calculate P(x) for x = 20, 21, 22, 23 and 24 and then sum the values to get the probability,
which is going to be very tedious. Translating from a binomial problem to a normal curve
problem gives:
µ = n·p = 60(0.30) = 18 and σ = √(n·p·q) = √(60 × 0.30 × 0.70) = 3.55
Applying the continuity correction, the binomial interval 20 ≤ x ≤ 24 corresponds to the normal
interval 19.50 ≤ x ≤ 24.50.
Figure: normal curve with µ = 18 and σ = 3.55, showing the area between 19.50 and 24.50
For x = 19.50:
z = (x − µ)/σ = (19.50 − 18)/3.55 = 0.42
For x = 24.50:
z = (x − µ)/σ = (24.50 − 18)/3.55 = 1.83
From the z table we find that the area for z = 0.42 is 0.1628. This value is the area
between the mean and the z value. Similarly, for z = 1.83, the area is 0.4664. To find the
required probability, we subtract the two values:
P(19.50 ≤ x ≤ 24.50) = 0.4664 − 0.1628 = 0.3036
Thus the probability that the value will fall between 19.50 and 24.50 is about 0.30. You may check the
value by using the binomial distribution formula; the answer will be nearly the same.
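The check suggested in the text can be carried out directly. This Python sketch (our own illustration) compares the exact binomial sum with the normal approximation over the continuity-corrected interval 19.5 to 24.5:

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 60, 0.30
q = 1 - p

# Exact binomial probability P(20 <= X <= 24)
exact = sum(comb(n, x) * p**x * q**(n - x) for x in range(20, 25))

# Normal approximation with continuity correction
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
approx = approx_dist.cdf(24.5) - approx_dist.cdf(19.5)

print(round(exact, 4), round(approx, 4))
```

The two results agree to about two decimal places, illustrating how good the approximation is at n = 60.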
4.5 Exponential Distribution
The exponential distribution (illustrated in Figure 4) has the probability density function:
f(x) = λe^(−λx), where x ≥ 0, λ > 0 and e = 2.71828
The mean of an exponential distribution is 𝜇 = 1⁄𝜆 and the standard deviation is 𝜎 = 1⁄𝜆
Probabilities are computed by determining the area under the curve between two points.
Applying calculus to the exponential probability density function gives a formula that can be
used to compute the probabilities of exponential distribution:
P(x ≥ x0) = e^(−λx0)
where x0 ≥ 0 and is the number or fraction of intervals between arrivals in the probability
question.
Suppose, for example, that customers arrive at an average rate of λ = 1.5 per minute. The
inter-arrival time is then exponentially distributed, and the mean of the exponential
distribution can be calculated as µ = 1/λ = 1/1.5 = 0.667 minutes, or 40 seconds. It means that on
average 40 seconds elapse between the arrivals of two consecutive customers. The
probability of an interval of 2 or more minutes can be calculated as follows:
P(x ≥ 2) = e^(−(1.5)(2)) = e^(−3) = 0.0498
Illustration: The exponential distribution can be used to solve Poisson-type problems in which
the intervals are not time. The Air Travel Consumer Report published that the average number of
mishandled baggage occurrences is 4.06 per 1,000 passengers. Assume mishandled baggage
occurrences are Poisson distributed. Determine the average number of passengers between
occurrences. Suppose a bag has just been mishandled; what is the probability that the next
occurrence will come within fewer than 190 passengers? What is the probability that it is between
190 and 495 passengers?
As λ = 4.06 per 1,000 passengers, the mean of the exponential distribution can be calculated as:
µ = 1/λ = 1/4.06 = 0.2463 (in units of 1,000 passengers)
= 0.2463 × 1000 = 246.3 passengers
The formula for computing exponential probabilities gives P(x ≥ x0); however, we want to find
the probability of fewer than 190 passengers in this problem. This can be solved as:
x0 = 190/1,000 passengers = 0.19
P(x ≥ 0.19) = e^(−λx0) = e^(−4.06(0.19)) = e^(−0.7714) = 0.4624
As the total area under the curve is 1, P(x < 190) = 1 − 0.4624 = 0.5376
To find the probability between 190 and 495 passengers:
P(x ≥ 495) = e^(−4.06(0.495)) = e^(−2.0097) = 0.1340
We have already calculated P(x ≥ 190) = 0.4624.
The required probability can be computed by subtracting P(x ≥ 495) from P(x ≥ 190):
0.4624 − 0.1340 = 0.3284
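The baggage illustration can be checked with a few lines of Python (an illustrative sketch; the survival function P(x ≥ x0) = e^(−λx0) is applied exactly as in the text):

```python
from math import exp

lam = 4.06  # mishandled-baggage occurrences per 1,000 passengers

def exp_survival(x0, lam):
    """P(X >= x0) for an exponential distribution with rate lam."""
    return exp(-lam * x0)

mean_gap = 1 / lam * 1000  # average passengers between occurrences
p_fewer_190 = 1 - exp_survival(190 / 1000, lam)
p_between = exp_survival(190 / 1000, lam) - exp_survival(495 / 1000, lam)

print(round(mean_gap, 1))     # 246.3 passengers
print(round(p_fewer_190, 4))  # close to the text's 0.5376
print(round(p_between, 4))    # close to the text's 0.3284
```

Small differences in the last decimal place, if any, come from the text rounding each exponential term to four decimals before subtracting.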
In operations research, Poisson distribution in conjunction with exponential distribution is used
to solve queuing problems. The Poisson distribution is used to analyse the arrivals in a queue and
exponential distribution is used to analyse inter-arrival time.
The most widely used distribution is the normal distribution. Many phenomena are normally
distributed, like characteristics of machine parts, many measurements of the natural environment,
and human characteristics such as height, weight, IQ and test scores. The parameters necessary to
describe a normal distribution are the mean and the standard deviation. For convenience, the data
should be standardized by using the mean and standard deviation to compute z scores. The probability
of the z score of an x value can be determined from the table of z scores. The normal distribution is
also used to work certain types of binomial distribution problems.
1. True
2. (a)
1. True
2. (a)
3. (c)
4. (d)
5. (b)
1. True
2. False
3. True
4. (d)
5. (c)
6. (d)
1. False
2. False
3. (c)
4. (d)
5. (b)
4.8 Glossary
Uniform probability distribution: A continuous probability distribution in which the
probability that the random variable will assume a value in any interval of equal length is the same
for each interval.
Probability density function: The function that describes the probability distribution of a
continuous random variable.
Normal distribution: A continuous probability distribution whose probability density function
is bell shaped and is determined by the mean and standard deviation.
Standard normal distribution: A normal distribution with a mean of 0 and a standard deviation
of 1.
Z score: The distance that an x value is from the mean µ in units of standard deviations.
Exponential distribution: A continuous probability distribution that is useful in describing the
time to complete a task or the time/interval between occurrences of an event.
4.9 Assignment
4.10 Activities
Use the probability density formula to sketch the graphs of the following exponential
distributions: (a) λ = 0.2, (b) λ = 0.4, (c) λ = 0.4. Hint: use x = 0, 1, 2, 3, … and find f(x).
In this block, we studied how quantitative methods may be used to help managers make better
decisions. In the first unit, the meaning and use of various quantitative analysis methods in the
field of business and management was explained, and the basic difference between statistics and
operations research was discussed along with their techniques. In the second unit, the concept of
measures of central tendency was introduced, and various measures of central tendency and their
relative importance were discussed. In the third unit the applications of various types of discrete
probability distributions were discussed. In the last unit continuous probability distributions and
their various applications were covered.
Block Assignment
5. The Poisson distribution of annual trips per family to amusement parks gives an average of
0.6 trips per year. What is the probability that a randomly selected family did not make a trip
to an amusement park last year? What is the probability that a randomly selected family took
three or fewer trips to amusement parks over a three-year period?
Block Introduction
In this block, we will study decision-making techniques which are used to make business
decisions and forecasts. In the first unit, the concept of decision making, along with the decision tree
approach and other related concepts like single-stage decisions, multi-stage decisions, issues, and
types of decision environments, will be discussed. In the second unit, we will explore
relationships between variables through correlation and regression analysis and learn how to
develop models that can be used to predict one variable from another. Here, we will also
learn to make meaningful predictions from the given data by fitting them to a linear function.
In the third unit some of the basic concepts of forecasting will be discussed for planning and
understanding decisions in a scientific approach. We will also explore the statistical techniques
that can be used to forecast values from time-series data and to know how well the forecast is
being done.
Objectives
After learning this block, you will be able to:
• Understand decision problems which involve various uncertainties in different types of
environments
• Understand the decision-making process
• Analyze problems using the decision tree approach
• Make decisions under uncertainty
• Analyze situations where probabilities of outcomes are uncertain
• Understand the concept of correlation
• Understand the role of regression in establishing mathematical relationships
between dependent and independent variables from given data
• Use the least squares criterion to estimate the model parameters
• Learn the meaning and calculation of residuals
• Identify the standard errors of estimate
Block Structure
Unit 3: Forecasting
Unit No. 1 Decision Theory
______________________________________________
Unit Structure
1.0 Learning Objectives
1.1 Introduction
1.1.1 Types of decision-making environments
Check your progress 1
1.6 Glossary
1.7 Assignment
1.8 Activities
1.1 Introduction
Every stage of our life, including the day-to-day routine, involves various kinds of decisions.
Decision problems are everywhere, and decision theory is ultimately concerned with making good
decisions. People from different times and fields have used decision theory under different
environments to arrive at final decisions. The analysis varies with the nature of the decision
problem, so any classification base for decision problems provides us with a means to segregate
the decision analysis approach. An important condition for the existence of a decision problem is
the presence of alternative courses of action. Each action leads to a consequence through a possible
set of outcomes, based on information that might be known or unknown. One of the several ways of
classifying decision problems has been based on this knowledge about the information on
outcomes. Broadly, two classifications result:
a) The information on outcomes is deterministic and known with certainty, and
b) The information on outcomes is probabilistic (uncertain), with the probabilities known or
unknown.
The former is classified as decision making under certainty, while the latter is called decision
making under uncertainty. The theory that has resulted from analyzing decision problems in
uncertain situations is commonly known as decision theory. The agenda of this unit is to study
some methods for solving decision problems under uncertainty. Decision theory is an analytic
and systematic approach to decision making. A good decision is one that is based on logic,
considers all available data and possible alternatives, and applies the quantitative approach
described here.
Type 1: Decision making under certainty: The decision maker knows with certainty the
consequences of every alternative or decision choice.
Type 2: Decision making under uncertainty: The decision maker does not know the probabilities
of the various outcomes.
Type 3: Decision making under risk: The decision maker knows the probabilities of the various
outcomes.
Check your progress 1
1. When the information on outcomes is deterministic and known with certainty, the situation
is known as ____________
2. The necessary condition for the existence of a decision problem is the presence
of ___________
4. Which theory concerns making sound decisions under conditions of certainty, risk and
uncertainty?
a. Game Theory
b. Network Analysis
c. Decision Theory
d. None of the above
Different problems arise while analyzing decision problems under uncertain conditions of
outcomes. First, decisions can be viewed either as independent decisions (single-stage/one-time
decisions) or as a sequence of decisions taken over a period of time. Depending on the planning
horizon and the nature of the decisions, we have either a single-stage decision problem or a
sequential decision problem. In real life, decisions are generally sequential, which makes them
difficult to solve. Fortunately, valid assumptions in most cases help to reduce the number of
stages and make the problem solvable. So, decision theory basically deals with the following two
types of problems:
(a) One-stage decision-making process
(b) Multi-stage decision-making process
Now consider that the problem is to find the number of magazine copies one should stock in the
face of uncertain demand, such that the expected profit is maximized. A critical evaluation of the
method shows that the calculation becomes tedious as the number of values that the demand can
take increases. You can also try the method with a discrete distribution of demand, where demand
can take values in some range, and then do trial and error for each and every value of demand;
that is again a time-consuming task. So it calls for separate techniques to make decisions. We
will learn a technique for solving such single-stage problems called marginal analysis. For
sequential decision problems, the decision tree approach is helpful and will be explained in a
later section.
In the analysis, we will be using some criteria, but the main one is the expected monetary value
criterion (all other criteria will be explained in the next section). However, this criterion suffers
from two problems. Expected profit, or Expected Monetary Value (EMV) as it is more commonly
known, does not take into account the decision maker's attitude towards risk. The other problem
with Expected Monetary Value is that it can be applied only when the probabilities of outcomes
are known. For problems where the probabilities are unknown, one way is to assign equal
probabilities to the outcomes, and then use EMV for decision making. However, this is not
always rational, and other criteria are available for deciding in such situations.
The following are the steps of the decision-making process, which can be commonly used for
any approach. Consider the payoff table below:
State of Nature
Alternative               Favourable Market    Unfavourable Market
Construct a large plant   200,000              -180,000
Construct a small plant   100,000              -20,000
Do nothing                0                    0
1. Maximax (optimistic): Used to find the alternative that maximizes the maximum payoff.
Locate the maximum payoff for each alternative. Select the alternative with the maximum of
these payoffs.
State of Nature
Alternative               Favourable Market    Unfavourable Market    Maximum in a row
Construct a large plant   200,000              -180,000               200,000
Construct a small plant   100,000              -20,000                100,000
Do nothing                0                    0                      0
2. Maximin (pessimistic): Used to find the alternative that maximizes the minimum payoff.
Locate the minimum payoff for each alternative. Select the alternative with the maximum of
these minimum payoffs.
State of Nature
Alternative               Favourable Market    Unfavourable Market    Minimum in a row
Construct a large plant   200,000              -180,000               -180,000
Construct a small plant   100,000              -20,000                -20,000
Do nothing                0                    0                      0
3. Criterion of realism (Hurwicz): A weighted average of the best and worst payoffs for each
alternative: Realism = α(maximum in row) + (1 − α)(minimum in row). Here the coefficient of
realism α = 0.8, as implied by the table values. Select the alternative with the highest realism value.
State of Nature
Alternative               Favourable Market    Unfavourable Market    Realism (α = 0.8)
Construct a large plant   200,000              -180,000               1,24,000
Construct a small plant   100,000              -20,000                76,000
Do nothing                0                    0                      0
4. Equally likely (Laplace): Considers the average of the payoffs for each alternative.
Find the average payoff for each alternative. Select the alternative with the highest average.
State of Nature
Alternative               Favourable Market    Unfavourable Market    Row average
Construct a large plant   200,000              -180,000               10,000
Construct a small plant   100,000              -20,000                40,000
Do nothing                0                    0                      0
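The four criteria above can be computed mechanically from the payoff table. Below is a Python sketch (names are our own; α = 0.8 for the realism criterion, as implied by the table values) that scores every alternative under each criterion and picks the best one:

```python
payoffs = {
    "Construct a large plant": [200_000, -180_000],
    "Construct a small plant": [100_000, -20_000],
    "Do nothing": [0, 0],
}
alpha = 0.8  # coefficient of realism (Hurwicz)

criteria = {
    "maximax": lambda row: max(row),                                   # best case
    "maximin": lambda row: min(row),                                   # worst case
    "realism": lambda row: alpha * max(row) + (1 - alpha) * min(row),  # weighted
    "laplace": lambda row: sum(row) / len(row),                        # equally likely
}

best = {}
for name, score in criteria.items():
    # Score every alternative under this criterion, then pick the highest
    best[name] = max(payoffs, key=lambda alt: score(payoffs[alt]))

for name, alt in best.items():
    print(f"{name}: {alt}")
```

Note how the recommendation changes with the criterion: the optimist builds the large plant, the pessimist does nothing, and the equally-likely average favours the small plant.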
Decision making under risk occurs when there are several possible states of nature and the
probabilities associated with each possible state are known. The most popular method is to choose
the alternative with the highest expected monetary value (EMV).
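A minimal EMV sketch using the plant payoff table: the 0.5/0.5 probabilities are an assumption for illustration only, since the table in this section does not state any.

```python
# EMV sketch for the plant example. The 0.5/0.5 probabilities are an
# assumption for illustration; the text's table does not state them.
probs = [0.5, 0.5]  # P(favourable market), P(unfavourable market)
payoffs = {
    "Construct a large plant": [200_000, -180_000],
    "Construct a small plant": [100_000, -20_000],
    "Do nothing": [0, 0],
}

def emv(row, probs):
    # EMV = sum over states of (payoff x probability of that state).
    return sum(x * p for x, p in zip(row, probs))

best = max(payoffs, key=lambda a: emv(payoffs[a], probs))
# With these probabilities the small plant has the highest EMV:
# 0.5 * 100,000 + 0.5 * (-20,000) = 40,000
```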
Any problem that can be presented in a decision table can also be graphically represented in a
decision tree. Decision trees are most beneficial when a sequence of decisions must be made. All
decision trees contain decision points or nodes, from which one of several alternatives may be
chosen. All decision trees contain state-of-nature points or nodes, out of which one state of nature
will occur.
Structure of decision-tree
Decision Theory provides us with the structure and methods for analyzing decision problems
under uncertainty, certainty and risk. The decision problems under uncertainty are characterized
by different courses of action and uncertain or risky outcomes corresponding to each action or
alternative. The problems can involve a single stage or a multi-stage decision process. Expected
monetary value and other criteria are helpful in solving single-stage problems, whereas the
decision tree approach is useful for solving multi-stage problems. In this unit we have learned
the applications of these methods to solve decision problems. The main objective of these
decision-making methods is to maximize the Expected Monetary Value (EMV). Choosing by
EMV, with either method, implicitly assumes that the decision maker is risk-neutral, basing
decisions purely on the expected outcomes.
4. C
2. EMV
3. Attitude
4. Outcomes
Answers to Check your progress 3
1. 0 to 1
2. Laplace
1.6 Glossary
Decision making under certainty: The decision maker knows with certainty the consequences
of every alternative or decision choice.
Decision making under uncertainty: The decision maker does not know the probabilities of the
various outcomes.
Decision making under risk: The decision maker knows the probabilities of the various
outcomes.
Maximax (optimistic): Used to find the alternative that maximizes the maximum payoff.
Maximin (pessimistic): Used to find the alternative that maximizes the minimum payoff.
Equally likely (Laplace): Considers all the payoffs for each alternative with highest average.
EMV: The expected monetary value of an alternative: the sum of each payoff multiplied by its
probability of occurrence. The alternative with the highest EMV is chosen.
Lines or branches in decision tree: They connect the decision nodes and the states of nature.
1.7 Assignment
1. A small group of investors is considering planting a tree farm. Their choices are (1) don't plant
trees, (2) plant a small number of trees, or (3) plant a large number of trees. The investors are
concerned about the demand for trees. If demand for trees declines, planting a large tree farm
would probably result in a loss. However, if a large increase in the demand for trees occurs, not
planting a tree farm could mean a large loss in revenue opportunity. They determine that three
states of demand are possible: (1) demand declines, (2) demand remains the same as it is, and (3)
demand increases. Use the following decision table to compute an expected monetary value for
this decision opportunity. Also show decision tree for the same.
State of Demand
Decision Alternatives Decline (0.20) Same (0.30) Increase (0.50)
Don’t Plant 30 0 -40
Small Tree Farm -80 15 190
Large Tree Farm -550 -120 750
2. Some oil speculators are interested in drilling an oil well. The rights to the land have been
secured and they must decide whether to drill. The states of nature are that oil is present or that
no oil is present. Their two decision alternatives are drill or don’t drill. If they strike oil, the well
will pay 2 million. If they have a dry hole, they will lose 150,000. If they don't drill, their
payoff is Rs. 0 whether oil is present or not. The probability that oil is present is
.12. Use this information to construct a decision table and decision tree and compute an expected
monetary value for this problem.
3. A car rental agency faces the decision of buying a fleet of cars, all of which will be the same
size. It can purchase a fleet of small cars, medium cars, or large cars. The smallest cars are the
most fuel efficient and the largest cars are the greatest fuel users. One of the problems for the
decision makers is that they do not know whether the price of fuel will increase or decrease in
the near future. If the price increases, the small cars are likely to be most popular. If the price
decreases, customers may demand the larger cars. Following is a decision table with these
decision alternatives, the states of nature, the probabilities, and the payoffs. Use this information
to determine the expected monetary value for this problem.
State of Nature
Decision Alternatives Fuel Decrease (0.70) Fuel Increase (0.30)
Small Cars 225 450
Medium Cars -175 -135
Large Cars 400 380
1.8 Activities
1. Suppose you have the option of investing either in Project A or in Project B. The outcomes of
both the projects are uncertain. If you invest in Project A, there is a 98% chance of making Rs.
25,000 profit, and 2% chance of losing Rs. 90,000. If project B is chosen, there is a 50-50 chance
of making a profit of Rs. 7,000 or Rs. 17,000. Which project will you choose and why?
2. Suppose in above activity 1, you have calculated the expected payoff (EMV) for both the
projects as follows.
EMVA = 0.98 × 25,000 − 0.02 × 90,000 = Rs. 22,700.
EMVB = 0.5 × 7,000 + 0.5 × 17,000 = Rs. 12,000.
You have thus found that by investing in Project A, you can expect more money, so you have
chosen A. Your friend, when given the same option, chooses B, arguing that he would not like to
go bankrupt (losing 90,000) by choosing A. How do you reconcile these two arguments?
(b) A smaller scale project (B) to re-decorate her premises. At Rs.500,000 this is less costly but
will produce a lower pay-off. Research data suggests a 30% chance of a gain of Rs.1,000,000 but
a 70% chance of it being only Rs.500,000.
(c) Continuing the present operation without change (C). It will cost nothing, but neither will it
produce any pay-off. Clients will be unhappy and it will become harder and harder to rent the
flats out when they become free.
1. Business Statistics: For Contemporary Decision Making, by Ken Black, Wiley Publication
2. Quantitative Techniques in Management, by N.D. Vora, McGraw Hill
3. Operations Research theory and Applications, by J.K. Sharma, Macmillan
4. Operations Research, By Hamdy A Taha, Pearson Education
5. Quantitative Analysis for Management, by Barry Render, Ralph M. Stair, Michael E. Hanna
and T. N. Badri, Pearson Publication
6. Statistics for management, Levin and Rubin, Pearson Education
7. Business Statistics, by David M. Levine et al., Pearson Education
8. Use of software like QM for Windows, Excel Solver
Unit No. 2 Correlation and Regression
Analysis
_________________________________
Unit Structure
2.0 Learning Objectives
2.1 Introduction
2.7 Glossary
2.8 Assignment
2.9 Activities
2.1 Introduction
In industry and business today, large amounts of data are continuously being generated, and this
calls for statistical analysis of mass data. Data is an asset for any business. This
data can be company's annual production, annual sales, capacity utilization, turnover,
profits, man-power levels, absenteeism or some other variable of direct interest to
management. In general, the data can concern any aspect of finance, marketing, human resources,
inventory or production, or there might be technical data regarding processes, such as
temperature, pressure etc. Sometimes it is related to quality
control issues. The accumulated data can be used to gain information about the system
(as for instance what happens to the market return when Sensex goes down) or to identify
past pattern of trends, behavior or simply used for control purposes to check if the
process or system is operating as planned and designed (as for instance in quality
control). So main objective to learn correlation and regression is primarily for extracting
the main features of the relationships and impacts hidden in or implied by the mass of
data.
The data we analyze can have many variables, and it is of interest to examine the effects that
some variables have on others. Identifying the exact functional relationship between variables
can be too complex, but we may wish to approximate the relationship with a simple mathematical
tool such as correlation or a straight line fitted by least squares.
For instance, the monthly consumption of raw materials at a particular company, daily
demand of a particular product, weekly price change in petrol could all be variables of
interest. We are, however, interested in some key performance variable (let us consider sales and
advertisement) and would like to see how this key variable (called the response variable or
dependent variable, here sales) is affected by the other variables (often called independent or
explanatory variables, here advertisement).
Example I
A study is designed to check the relationship between smoking and longevity. A sample of 15
men 50 years and older was taken and the average number of cigarettes smoked per day and their
age at death was measured. Here cigarettes smoking is independent variable (X) and Longevity
is dependent variable (Y). n is number of pairs = 15
Put all the calculated values into the formula learned above. The answer is r = -0.71343, a moderate
negative correlation (the variables move in opposite directions). In conclusion, the less a man
smokes, the longer he tends to live.
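The Pearson coefficient computation from Example I can be sketched as follows. The 15 actual observations are not reproduced in the text, so the data below is hypothetical and only illustrates the negative relationship:

```python
import math

def pearson_r(x, y):
    # r = S_xy / sqrt(S_xx * S_yy), the product-moment formula.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical (cigarettes per day, age at death) pairs: heavier smoking,
# shorter life, giving a negative r as in the example.
cigs = [0, 5, 10, 15, 20, 25, 30]
age = [80, 78, 72, 75, 68, 65, 60]
r = pearson_r(cigs, age)
```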
a) 0 and 1
b) -1 to 0 to 1
c) -1
d) None of the above
ŷ = b0 + b1x
where
ŷ is the predicted value of the dependent variable (the variable on the Y axis), x is the independent
variable (plotted on the X axis), b1 is the slope of the line and b0 is the y-intercept.
Example II
In the table below, the xi column shows scores on the aptitude test and the yi column shows statistics
grades. Conduct the regression analysis and residual analysis, and find the standard error of estimate.
Student Aptitude Statistics (𝑥− 𝑥)2 (𝑦− 𝑦)2 (𝑥− 𝑥̅)(𝑦− 𝑦̅)
Marks (x) Marks (y)
1 95 85 289 64 136
2 85 95 49 324 126
3 80 70 4 49 -14
4 70 65 64 144 96
5 60 70 324 49 126
∑x = 390 ∑y = 385 Σ(𝑥− 𝑥)2 = 730 Σ(𝑦− 𝑦̅)2 = 630 Σ (𝑥− 𝑥̅)(𝑦− 𝑦̅) = 470
Mean Mean
𝑥̅ = 78 𝑦̅ = 77
Once we know the value of the slope (b1 = 470/730 ≈ 0.644), we can solve for the intercept
(b0):
b0 = 𝑦̅− 𝑏1 𝑥̅
b0 = 77 - (0.644)(78)
b0 = 26.768
Now you can predict value of statistics marks(Y) by any value of aptitude marks (X). Let us
consider that if student scores 88 marks in aptitude test, what will his/her score in statistics?
Here, X = 88, transfer this value in developed regression equation: ŷ = 26.768 + 0.644 * 88 =
83.44 marks in statistics.
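A sketch of the same calculation on the raw marks from the table (x = aptitude, y = statistics):

```python
x = [95, 85, 80, 70, 60]  # aptitude marks
y = [85, 95, 70, 65, 70]  # statistics marks

n = len(x)
mx, my = sum(x) / n, sum(y) / n  # 78 and 77, matching the table
# Slope b1 = sum of cross-deviations / sum of squared x-deviations.
b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
b0 = my - b1 * mx  # intercept

def predict(xv):
    # Predicted statistics marks for a given aptitude score.
    return b0 + b1 * xv

# b1 = 470/730 ~ 0.644, b0 ~ 26.78, predict(88) ~ 83.44
```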
Each difference between the actual y values and the predicted y values is the error of the
regression line at a given point, and is referred to as the residual. It is the sum of squares of these
residuals that is minimized to find the least squares line. You can find the predicted y values by
putting the x values one by one into the regression line that has already been developed.
Residuals represent errors of estimation for individual points. With large samples of data, residual
computations become laborious. Even with computers, a researcher sometimes has difficulty
working through pages of residuals in an effort to understand the error of the regression model.
An alternative way of examining the error of the model is the standard error of the estimate, which
provides a single measurement of the regression error. Because the sum of the residuals is zero,
attempting to determine the total amount of error by summing the residuals is fruitless. This zero-
sum characteristic of residuals can be avoided by squaring the residuals and then summing them.
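Continuing with the aptitude/statistics data, the residuals and the standard error of the estimate (using the usual simple-regression formula, se = sqrt(SSE / (n − 2))) can be sketched as:

```python
import math

x = [95, 85, 80, 70, 60]  # aptitude marks
y = [85, 95, 70, 65, 70]  # statistics marks
n = len(x)
mx, my = sum(x) / n, sum(y) / n
b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
b0 = my - b1 * mx

# Residual = actual y minus predicted y at each point.
residuals = [yv - (b0 + b1 * xv) for xv, yv in zip(x, y)]
sse = sum(e * e for e in residuals)  # sum of squared errors
se = math.sqrt(sse / (n - 2))        # standard error of the estimate
```

The residuals sum to zero up to rounding, which is why SSE, the sum of their squares, is used instead.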
A widely used measure of fit for regression models is the coefficient of determination, or r².
The coefficient of determination is the proportion of variability of the dependent variable (y)
accounted for or explained by the independent variable (x). The coefficient of determination
ranges from 0 to 1. An r² of zero means that the predictor accounts for none of the variability of
the dependent variable and that there is no regression prediction of y by x. An r² of 1 means
perfect prediction of y by x and that 100% of the variability of y is accounted for by x. Of course,
most r² values are between the extremes. The researcher must interpret whether a particular r²
is high or low, depending on the use of the model and the context within which the model was
developed. In the correlation example the answer was r = -0.71, so its square is 0.5041. That
means about 50% of the variation in the dependent variable y is explained by the independent variable x.
Note:
Because r² is always positive, solving for r by taking the square root gives the correct magnitude
of r but may give the wrong sign. The researcher must examine the sign of the slope of the regression line to
determine whether a positive or negative relationship exists between the variables and then
assign the appropriate sign to the correlation value.
a) 0.6471
b) -0.6471
c) 0
d) 1
2. Suppose the correlation coefficient between height (as measured in feet) versus
weight (as measured in pounds) is 0.40. What is the correlation coefficient of height
measured in inches versus weight measured in ounces? [12 inches = one foot; 16 ounces
= one pound]
a) 0.40
b) 0.30
c) 0.533
d) Cannot be determined from information given
1. b
2. c
3. c
Answers to check your progress 2
1. c
2. c
3. a
1. a
2. a
3. r2 = 0.826
2.7 Glossary
Independent variable: A variable that can be set either to a desirable value or takes
values that can be observed but not controlled.
Estimate: A value obtained from data for a certain parameter of the assumed model
or a forecast value obtained from the model.
2.8 Assignment
1. Data on advertising expenditures (AE) and revenue (R) for the Four Seasons Restaurant is
given below. Figures are in 1000s.
AE 1 2 4 6 10 14 20
R 19 32 44 40 52 53 54
Answer Following:
c) Suppose SSR = 691 and SST = 1002. Find the value of R² and interpret it in the
context of the problem.
2. Use the following data to determine the equation of the least squares regression line.
X 12 21 28 8 20
Y 17 15 22 19 24
3. What is the measure of correlation between the interest rate of federal funds and
the commodities futures index? Use the following data:
4. Find the equation of the regression line for the following data and compute the residuals.
X 15 8 21 15 6 8 3
Y 45 38 55 46 24 33 49
2.9 Activities
A student is required to collect the stock price and stock return for the last 15 days of any particular
stock from “money control”. Now identify the independent and dependent variables, find the
Pearson correlation coefficient and the regression line, and comment on the outcome.
According to the Capital Asset Pricing Model (CAPM), the risk associated with a capital asset
is proportional to the slope β1 (or simply β, the regression coefficient of Y on X) obtained by
regressing the asset's past returns against the corresponding returns of the average portfolio, called
the market portfolio. (The return of the market portfolio represents the return earned by the
average investor. It is a weighted average of the returns from all the assets in the market.) The
larger the β of an asset, the larger is the risk associated with that asset. A β of 1.00 represents
average risk. The returns from an IT firm's stock and the corresponding returns for the market
portfolio for the past 10 years are given below:
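Since the 10 years of returns are not reproduced here, the sketch below uses hypothetical figures; β is simply the least-squares slope of the stock's returns on the market's returns:

```python
# Hypothetical annual returns for illustration only (the activity's actual
# 10-year data is not reproduced in the text).
market = [0.10, -0.05, 0.12, 0.08, -0.02, 0.15, 0.07, 0.03, -0.08, 0.11]
stock = [0.14, -0.09, 0.18, 0.10, -0.05, 0.20, 0.08, 0.02, -0.12, 0.16]

n = len(market)
mm, ms = sum(market) / n, sum(stock) / n
# beta = Cov(stock, market) / Var(market), i.e. the regression slope.
beta = sum((a - mm) * (b - ms) for a, b in zip(market, stock)) / sum(
    (a - mm) ** 2 for a in market
)
# beta > 1 indicates more-than-average risk under the CAPM reading above.
```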
3.1 Introduction
3.9 Glossary
3.10 Assignment
3.11 Activities
3.0 Learning Objectives
After learning this unit, you will be able to,
3.1 Introduction
Forecasting is a technique we use routinely in our day-to-day lives. It is the science and art of
predicting the future and then planning accordingly, and it is used every day in decision making.
It is a process which helps business people reach conclusions about buying, selling, producing,
hiring, planning, manufacturing, inventory management and many other actions. As examples,
consider the following:
• Market watchers predict low and high prices and returns on stocks over the short term, medium
term and long term.
• City planners forecast rain, temperature etc. in a particular city.
• Rising demand of laptops
• Predicting the future for paper industry
• Life insurance outlooks for number of claims for the next year.
• Trends or change in demand for clothing or apparels over the period of time
• Change in habit of eating over the period of time
How are these and other conclusions reached? What forecasting techniques are used? Are the
forecasts accurate? Here we will discuss several forecasting techniques, how to measure the error
of a forecast, and some of the problems that can occur in forecasting. Managers are always trying
to reduce uncertainty and make better estimates of what will happen in the future. This is the
main purpose of forecasting. So this unit will focus on quantitative and causal models
where data occur over time, i.e. time-series data. Time-series data are data gathered on a given
characteristic over a period of time at regular intervals. Time-series forecasting techniques
attempt to account for changes over time by examining patterns, cycles, or trends, or by using
information about previous time periods to predict the outcome for a future time period. Time-
series methods include moving averages, exponential smoothing, and least squares regression trend
analysis.
Delphi Method: This is an iterative group process where (possibly geographically dispersed)
respondents provide input to decision makers.
Sales Force Composite: This allows individual salespersons to estimate the sales in their region,
and the data is compiled at a district or national level.
2. Time-series models: attempt to predict the future based on the past. Common time-series
models are: moving average, exponential smoothing, trend projections.
3. Causal models: use variables or factors that might influence the quantity being forecasted. The
objective is to build a model with the best statistical relationship between the variable being
forecast and the independent variables. Regression analysis is the most common technique used
in causal modeling.
Check your progress 1
1. To apply the causal model approach, which of the following concepts can be used:
a) Regression Analysis
b) Decision Theory
c) Moving Average
d) Exponential Smoothing
A time series is a sequence of data points recorded at evenly spaced intervals. Time-series
forecasts predict the future based solely on the past values of the variable; other variables are ignored.
3.4.1 Components of Time-Series Analysis: A time series typically has four components:
Trend (T): the gradual upward or downward movement of the data over time, generally over a
longer period, typically more than five years, e.g. the trend in preference for mobile phones or in
the selection of new homes.
Seasonal change (S): a pattern of fluctuations above or below the trend line that repeats at
regular intervals, year by year or month by month, e.g. flu cases every year during the monsoon
season. It generally concerns short periods of less than a year.
Cycles (C): patterns in annual data that recur every several years, e.g. a general election every
5 years, a census every 10 years.
Random/irregular variations (R): movements in the data caused by chance or unusual situations,
following no discernible pattern. There is no fixed time period; the data can change rapidly or
slowly at any point in time.
Moving averages can be used when demand is relatively steady over time. The next forecast is
the average of the most recent n data values from the time series. This method tends to smooth
out short-term irregularities in the data series.
Moving Average Forecast = Sum of demand in previous n periods / n
Mathematically,
Ft+1 = (Yt + Yt−1 + … + Yt−n+1) / n
Where,
Ft+1 = forecast for time period t + 1
Yt = actual value in time period t
n = number of periods to average
Example I
The demand for a product in each of the last five months is shown below.
Month 1 2 3 4 5
Demand ('00s) 13 17 19 23 24
Solution of Example I
The two-month moving average for months two to five is given by:
m2 = (13 + 17)/2 = 15, m3 = (17 + 19)/2 = 18, m4 = (19 + 23)/2 = 21, m5 = (23 + 24)/2 = 23.5
The forecast for month six is just the moving average for the month before that, i.e. the moving
average for month 5: m5 = 23.5 ('00s), or 2350 units.
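The moving-average forecast above can be sketched as:

```python
def moving_average_forecast(series, n):
    # Next-period forecast = average of the most recent n observations.
    return sum(series[-n:]) / n

demand = [13, 17, 19, 23, 24]            # demand in '00s (Example I)
m5 = moving_average_forecast(demand, 2)  # (23 + 24) / 2 = 23.5
forecast_month_6 = m5 * 100              # 2350 units
```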
Exponential smoothing is a type of moving average that is easy to use and requires little record
keeping of data. The new estimate is the old estimate plus some fraction of the error in the last
period. The general approach is to develop trial forecasts with different values of α and select the
α with the lowest mean absolute deviation (MAD), which will be discussed in the next section.
New forecast = Last period’s forecast + α × (Last period’s actual demand – Last period’s
forecast)
Where,
α is a weight (or smoothing constant) with 0 ≤ α ≤ 1.
Mathematically,
Ft+1 = Ft + α (Yt – Ft)
Where:
Ft+1 = New forecast (for time period t + 1)
Ft = Previous forecast (for time period t)
α = Smoothing constant (0 ≤ α ≤ 1)
Yt = Previous period’s actual demand
Example II
In January, February’s demand for a certain car model was predicted to be 150. Actual February
demand was 166 autos. Using a smoothing constant of α = 0.20, what is the forecast for March?
Solution of Example II
New forecast (for March demand) = 150 + 0.2 × (166 – 150) = 153.2, or about 153 autos
If actual demand in March was 146 autos, the April forecast would be:
New forecast (for April demand) = 153.2 + 0.2 × (146 – 153.2) = 151.76, or about 152 autos
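The smoothing update used in Example II can be sketched as:

```python
def exp_smooth(prev_forecast, actual, alpha):
    # F(t+1) = F(t) + alpha * (Y(t) - F(t))
    return prev_forecast + alpha * (actual - prev_forecast)

march = exp_smooth(150, 166, 0.2)    # 153.2
april = exp_smooth(march, 146, 0.2)  # 151.76
```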
Comparing forecasted values with actual values shows how well a model works. Several measures
of accuracy are available, such as the mean absolute deviation (MAD) and the mean squared
error/deviation (MSE/MSD).
Example III
The table below shows the demand for a new aftershave in a shop for each of the last 7 months.
Month 1 2 3 4 5 6 7
Demand 23 29 33 40 41 43 49
a) Calculate a two-month moving average for months two to seven. What would be your forecast
for the demand in month eight?
b) Apply exponential smoothing with a smoothing constant of 0.1 to derive a forecast for the
demand in month eight.
c) Which of the two forecasts for month eight do you prefer and why?
Solution of Example III
a) The two-month moving average for months two to seven is given by:
m2 = 26, m3 = 31, m4 = 36.5, m5 = 40.5, m6 = 42, m7 = 46
The forecast for month eight is just the moving average for the month before that, i.e. the moving
average for month 7 = m7 = 46.
b) With a smoothing constant of 0.1:
M1 = Y1 = 23
M2 = 0.1Y2 + 0.9M1 = 0.1(29) + 0.9(23) = 23.60
M3 = 0.1Y3 + 0.9M2 = 0.1(33) + 0.9(23.60) = 24.54
M4 = 0.1Y4 + 0.9M3 = 0.1(40) + 0.9(24.54) = 26.09
M5 = 0.1Y5 + 0.9M4 = 0.1(41) + 0.9(26.09) = 27.58
M6 = 0.1Y6 + 0.9M5 = 0.1(43) + 0.9(27.58) = 29.12
M7 = 0.1Y7 + 0.9M6 = 0.1(49) + 0.9(29.12) = 31.11
As before the forecast for month eight is just the average for month 7 = M7 = 31.11 = 31 (as we
cannot have fractional demand).
c) To compare the two forecasts we calculate the mean squared deviation (MSD) of each. Doing
so, the two-month moving average has the lower MSD: with a smoothing constant of only 0.1,
the exponentially smoothed series lags well behind the steadily rising demand. Overall, then, the
two-month moving average appears to give the better one-month-ahead forecasts as it has the
lower MSD. Hence we prefer the forecast of 46 produced by the two-month moving average. In
the same way MSE can be used to compare the results and reach a final decision.
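The MSD comparison can be reproduced in code. As an assumption for the sketch, both MSDs are averaged over months 3 to 7, the months for which the two-month moving average produces one-month-ahead forecasts:

```python
demand = [23, 29, 33, 40, 41, 43, 49]  # months 1..7, Example III

# Two-month moving-average forecasts for months 3..7.
ma = [(demand[t - 2] + demand[t - 1]) / 2 for t in range(2, 7)]

# Exponentially smoothed series (alpha = 0.1), with M1 = Y1.
M = [demand[0]]
for y in demand[1:]:
    M.append(M[-1] + 0.1 * (y - M[-1]))
es = M[1:6]  # forecasts for months 3..7 are M2..M6

def msd(forecasts, actuals):
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals)

actuals = demand[2:]       # months 3..7
msd_ma = msd(ma, actuals)  # 41.1 under this convention
msd_es = msd(es, actuals)  # much larger: the smoothed series lags demand
```

The moving average's lower MSD matches the conclusion above that its forecast of 46 is preferred.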
Check your progress 2
1. Increase in the number of patients in the hospital due to heat stroke is:
(a) Secular trend (b) Irregular variation (c) Seasonal variation (d) Cyclical variation
2. An orderly set of data arranged in accordance with their time of occurrence is called:
(a) Arithmetic series (b) Harmonic series (c) Geometric series (d) Time series
5. Damages due to floods, droughts, strikes fires and political disturbances are:
(a) Trend (b) Seasonal (c) Cyclical (d) Irregular
Example IV
The sales of a company (in thousand rupees) for each of the last five years are shown in the table
below, with the years coded as t = 0 to 4.
y (sales) 12 19 29 37 45
t y t*y t2
0 12 0 0
1 19 19 1
2 29 58 4
3 37 111 9
4 45 180 16
We now calculate a and b using the least squares regression formulas. With n = 5, Σt = 10,
Σy = 142, Σty = 368 and Σt² = 30:
b = (nΣty − Σt·Σy) / (nΣt² − (Σt)²) = (5 × 368 − 10 × 142) / (5 × 30 − 10²) = 420/50 = 8.4
a = ȳ − b·t̄ = 28.4 − 8.4 × 2 = 11.6
so the trend line is ŷ = 11.6 + 8.4t.
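These least-squares calculations can be sketched as:

```python
t = [0, 1, 2, 3, 4]       # coded years
y = [12, 19, 29, 37, 45]  # sales in thousand rupees

n = len(t)
# Least-squares slope and intercept for y-hat = a + b*t.
b = (n * sum(ti * yi for ti, yi in zip(t, y)) - sum(t) * sum(y)) / (
    n * sum(ti ** 2 for ti in t) - sum(t) ** 2
)
a = sum(y) / n - b * sum(t) / n
next_year = a + b * 5  # trend forecast for the following year (t = 5)
```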
3. If a = 11.8 and b = 19 for the years 2008 to 2015, the regression equation is
_______________ and the forecast for 2018 is _____________
The unit mainly focuses on the importance of forecasting in all our short-term, medium-term
and long-term planning decisions. For long-term planning decisions, qualitative techniques are
used, such as technological forecasting, expert opinion through Delphi, or opinion polls using
personal interviews or questionnaires. For medium-term and short-term decisions, apart from
subjective and intuitive methods, there is a wide variety of statistical techniques that can be
employed, such as moving averages or exponential smoothing, which are based on past data.
Any suitable mathematical function can be fitted to the demand history by using least squares
regression. Regression is also used in the estimation of parameters of causal or econometric
models.
1. a
2. Qualitative, Feelings/Intuition
3. Future Demand
1. c
2. d
3. d
4. b
5. d
1. b
2. d
Moving Average: An average computed from the n most recent demand points (for an n-period
moving average), commonly used for short-term forecasting.
Prediction: A term denoting an estimate or guess of a future variable that may be arrived at by
subjective feeling or intuition.
Regression: Establishing, from a given demand history, a relation between the dependent variable
(such as demand) and one or more independent variables. These relations are important for
planning future demand.
Time Series: Any data on demand, sales or consumption taken at regular intervals of time is a
time series. Analysis of a time series to discover patterns of growth, demand, seasonal trends or
random fluctuations is known as time series analysis.
Causal Models: Forecasting models wherein the demand, or variable of interest, is related to one
or more explanatory (causal) variables.
Delphi: A method of collecting information from experts, useful for long term forecasting. It is
iterative and maintains confidentiality to reduce subjective bias.
3.10 Assignment
1. The table below shows the demand for a particular brand of razor in a shop for each of the last
nine months.
Month 1 2 3 4 5 6 7 8 9
Demand 10 12 13 17 15 19 20 21 20
a) Calculate a three-month moving average for months three to nine. What would be your
forecast for the demand in month ten?
b) Apply exponential smoothing with a smoothing constant of 0.3 to derive a forecast for the
demand in month ten.
c) Which of the two forecasts for month ten do you prefer and why?
2. The table below shows the demand for a particular brand of fax machine in a department store
in each of the last twelve months.
Month 1 2 3 4 5 6 7 8 9 10 11 12
Demand 12 15 19 23 27 30 32 33 37 41 49 58
a) Calculate the four-month moving average for months 4 to 12. What would be your forecast for
the demand in month 13?
b) Apply exponential smoothing with a smoothing constant of 0.2 to derive a forecast for the
demand in month 13.
c) Which of the two forecasts for month 13 do you prefer and why?
3. Find the regression trend line for the following data of equity fund investment (In lakhs of
rupees per year) from 2001 to 2018.
3.11 Activities
1. You are required to collect the data on corona cases registered and recovered from March 20,
2020 to June 20, 2020. Analyze the trend between the two variables, and forecast the number of
new cases for the month of July 2020.
2. Visit a manufacturing company which is established for at least 15 years. Select any product
of the company if they are manufacturing more than one product. Collect the data of price,
production, demand, sales year wise. Now identify the change in each variable data with respect
to years passed.
Following are the average yields of long-term new corporate bonds over a several-month period,
published by the Market Finance Department of the Treasury.
b) Use a 4-month moving average to forecast values for each of the ensuing months.
c) Use simple exponential smoothing to forecast values for each of the ensuing months, first with
α = .3 and then with a second value of α. Which weight produces better forecasts?
d) Compute MAD for the forecasts obtained in parts (b) and (c) and compare the results.
In this block, we learned various techniques about the vital aspect of any business that is decision
making and forecasting. The decisions taken by applying quantitative methods may be used to
achieve optimum profit or cost and perhaps it can help to forecast future also. In the first unit,
one stage and multi stage decision making techniques were explained. Decision making under
uncertainty and risk along with the decision tree approach with certainty have been discussed. In
the second unit, linear relationships between independent and dependent variables were
discussed with the help of concepts like correlation, coefficient of determination and regression
analysis. In the last unit, the forecasting techniques with various models were explained. Third
unit also covered the time series analysis and least square regression analysis.
Block Assignment
If tenders are to be submitted the company will incur additional costs. These costs will have
to be entirely recouped from the contract price. The risk, of course, is that if a tender is
unsuccessful the company will have made a loss.
The cost of tendering for contract MS1 only is 50,000. The component supply cost if the
tender is successful would be 18,000.
The cost of tendering for contract MS2 only is 14,000. The component supply cost if the
tender is successful would be 12,000.
The cost of tendering for both contract MS1 and contract MS2 is 55,000. The component
supply cost if the tender is successful would be 24,000.
For each contract, possible tender prices have been determined. In addition, subjective
assessments have been made of the probability of getting the contract with a particular tender
price as shown below. Note here that the company can only submit one tender and cannot,
for example, submit two tenders (at different prices) for the same contract. Solve the
dilemma with decision tree approach.
5. Calculate the Pearson product-moment correlation coefficient and the regression line for the
following data:
X = Price 11 12 13 14 16 15 17
Y = Amount Demanded 40 39 43 44 38 36 46
Block Structure
_________________________________
Block 3 Linear Programming Problem and
Special problems
_________________________________
Block Introduction
Operations research has always been a vital part of industry. The agenda of doing research on
operations is maximum utilization of available resources within given restrictions. As resources
are generally scarce, there is a need to learn techniques which can help in achieving maximum
profit along with minimum cost. Thus, in this block we will explore some of the most
common and useful techniques of linear programming problems for two or more variables. The
first unit describes the formulation of a given problem as a mathematical function, which is then
solved by graphical analysis to arrive at decisions. Decisions always concern one of two
objectives: maximization of profit or minimization of cost. The second unit describes
the simplex method, which is used when two or more decision variables are involved, to
utilize the available resources in the best possible way and maximize profit. The third unit
describes the development of a transportation schedule for shipments from sources to
destinations. In the fourth unit we will explore the assignment concept, which is useful for
understanding the allocation of jobs/projects to employees, workers or machines with a scientific approach.
Block Objectives
After learning this block, you will be able to:
Block Structure
Unit 3: Transportation
Unit 4: Assignment
_________________________________
Unit No. 1 Linear Programming
formulation and Graphical Method
_________________________________
Unit Structure
1.0 Learning Objectives
1.1 Introduction
1.2 Characteristics of LPP
1.9 Glossary
1.10 Assignment
1.11 Activities
1.12 Case Study
1.1 Introduction
Linear programming is a widely used mathematical modeling technique designed to help
managers in planning and decision making relative to resource allocation. Resources include
machinery, labor, money, time, warehouse space, raw materials, etc. It is a powerful technique
for supporting managerial decision making for certain kinds of problems. The basic approach is
to formulate a mathematical model, called a linear programming model, to represent the problem
and then to analyze this model. Any linear programming model includes three basic parts: 1.
Decision variables that represent the decisions to be made, 2. Constraints that represent the
restrictions on the feasible values of these decision variables, and 3. An objective function that
expresses the overall measure of performance for the problem.
Only the graphical method for two decision variables is presented in this unit; easy and efficient
computational procedures, known as algorithms, are available to solve larger linear
programming problems. The development of various software packages has made it possible to
solve problems with a large number of decision variables and constraints.
The formulation of a linear programming problem can be explained through product mix
problem. Typically, it occurs in a manufacturing industry where there is a requirement of
manufacturing variety of products with given set of resources. Each of the products has a certain
margin of profit per unit and cost per unit. These products use a common bunch of resources –
according to availability. The linear programming technique identifies the combination of the
products which will either maximize the profit or Minimize the cost without violating the
restrictions related to resources. So, the company would like to determine how many units of
each product it should produce so as to maximize overall profit or minimize overall production
cost. Basically, it involves two types of LPPs: Maximization (Profit) and Minimization (Cost).
Example I (Maximization)
The Jay Ambe Company produces two types of products: tables and chairs. Processes are similar
in that both require a certain number of hours of carpentry work and of work in the painting
department. Each table takes 5 hours of carpentry and 2 hours of painting. Each chair requires 4
hours of carpentry and 2 hours of painting. In total, 250 hours of carpentry time and 110 hours of
painting time are available per week. Each table yields a profit of Rs. 65 and each chair a profit
of Rs. 60. Formulate this as a Linear Programming Problem.
Solution of Example I
A firm wants to determine the best combination of tables and chairs to produce to reach the
maximum profit.
Hours required to produce one unit
Department Tables (x1) Chairs (x2) Available Hours/Week
Carpentry 5 4 250
Painting 2 2 110
Profit Per Unit 65 60
The objective is to: Maximize profit
The constraints according to two resources are:
• The hours of carpentry time used cannot exceed 250 hours per week.
• The hours of painting time used cannot exceed 110 hours per week.
We can use the total carpentry time or less than the given time, but not more than that:
5x1 + 4x2 ≤ 250 (hours of carpentry time)
Similarly, for painting, the constraint is 2x1 + 2x2 ≤ 110. Both of these constraints restrict
production capacity and affect total profit.
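The formulation above can also be checked numerically by enumerating the corner points of the feasible region, which is the same logic the graphical method applies. A minimal sketch in Python (the function names are our own, and the optimum reported follows from the data above, since the unit itself only asks for the formulation):

```python
# Jay Ambe Company: maximize 65*x1 + 60*x2
# subject to 5*x1 + 4*x2 <= 250 (carpentry) and 2*x1 + 2*x2 <= 110 (painting).

def profit(x1, x2):
    return 65 * x1 + 60 * x2

def feasible(x1, x2):
    return (x1 >= 0 and x2 >= 0
            and 5 * x1 + 4 * x2 <= 250   # carpentry hours
            and 2 * x1 + 2 * x2 <= 110)  # painting hours

# Corner points of the feasible region: origin, the two axis intercepts,
# and the intersection of the two constraint lines (x1 = 30, x2 = 25).
corners = [(0, 0), (50, 0), (0, 55), (30, 25)]
assert all(feasible(x1, x2) for x1, x2 in corners)

best = max(corners, key=lambda p: profit(*p))
print(best, profit(*best))  # (30, 25) with profit 3450
```

The best corner, 30 tables and 25 chairs, uses all 250 carpentry hours and all 110 painting hours.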
Example II (Minimization)
A farm is engaged in breeding pigs. The pigs are fed on various products grown on the farm.
With a view to ensuring certain minimum nutrition for the growth of the pigs, two types of feeds,
A and B, are purchased from the market. Feed A costs Rs. 20 per unit and feed B Rs. 40 per unit.
The contents of these feeds per unit, in nutrient constituents, are as given in the following table.
Formulate as an LPP.
Nutrient   Content in Feed A   Content in Feed B   Minimum requirement of nutrient for a pig
M1         12                  6                   108
M2         3                   9                   81
M3         15                  10                  150
Solution of Example II
The objective is to minimize the total cost of feed: Min Z = 20A + 40B.
We are required to feed at least the minimum nutrient amount, not less than that:
12A + 6B ≥ 108 (Minimum Nutrient M1 Requirement)
Similarly, for Nutrients M2 and M3, the constraints are 3A + 9B ≥ 81 and 15A + 10B ≥ 150
respectively.
From the above examples, we can see that in maximization problems the constraints typically
have a “less than or equal to” sign, while in minimization problems the constraints have a
“greater than or equal to” sign. Sometimes, however, a problem contains a combination of both
types of constraints, according to the availability of resources.
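For a two-variable problem like the pig-feed example, the corner points can be generated mechanically by intersecting every pair of constraint lines (including the axes) and keeping the feasible intersections. A sketch in plain Python (the helper names are our own; the minimum it finds follows from the data in the table above):

```python
from itertools import combinations

# Pig-feed problem: minimize 20*A + 40*B subject to the three nutrient
# constraints (>=) and non-negativity. Each constraint (cA, cB, rhs) means
# cA*A + cB*B >= rhs; the axes A = 0 and B = 0 are added as extra lines.
def cost(A, B):
    return 20 * A + 40 * B

cons = [(12, 6, 108), (3, 9, 81), (15, 10, 150)]
lines = cons + [(1, 0, 0), (0, 1, 0)]  # include the coordinate axes

def intersect(l1, l2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 (None if the lines are parallel)."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def feasible(A, B):
    return (A >= -1e-9 and B >= -1e-9
            and all(a * A + b * B >= r - 1e-9 for a, b, r in cons))

candidates = [p for l1, l2 in combinations(lines, 2)
              if (p := intersect(l1, l2)) and feasible(*p)]
best = min(candidates, key=lambda p: cost(*p))
print(best, cost(*best))  # about (5.4, 7.2) with minimum cost about 396
```

The minimum-cost corner lies where the M1 and M2 constraint lines intersect; the other feasible corners, (27, 0) and (0, 18), cost 540 and 720 respectively.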
1.3 Graphical Analysis
The easiest way to solve a small LPP is graphically. The graphical method only works when
there are just two decision variables. When there are more than two variables, a more complex
approach is needed, as it is not possible to plot the solution on a two-dimensional graph. The
graphical method provides valuable insight into how other approaches work.
Example III
Step 1
The first step in solving the problem is to identify a set or region of feasible solutions. To do
this, we plot each constraint equation on a graph.
(x1 = 60, x2 = 0)
Step 2
Drilling Constraint
(Graph: the drilling constraint line 4x1 + 3x2 = 240 plotted together with the milling constraint,
with the corner points of the feasible region labeled 1 to 4.)
Step 3
The above graph shows the feasible region, that is, “the region which satisfies all the
constraints”. For the drilling and milling constraints, the maximum availability is 240 and 100
respectively, so identify the common region which satisfies both constraints. (For identifying the
common feasible region, consider the sign of each constraint: “less than”, “greater than” or
“equal to”.)
Once the feasible region has been graphed, we need to find the optimal solution among the many
possible solutions. This approach is known as the Corner Point Method. It involves evaluating
the profit at every corner point of the feasible region. The mathematical theory behind LP is that
the optimal solution must lie at one of the corner points, or extreme points, of the feasible
region. For this example, the feasible region is a four-sided polygon with four corner points
labeled 1, 2, 3, and 4 on the graph.
To find the coordinates of the intersection point accurately, we solve for the intersection of the
two constraint lines. Using the simultaneous equations method, we multiply the Milling equation
by -2 and add it to the Drilling equation.
Find the final solution by putting all x1 and x2 values in objective function.
Because Point 3 returns the highest profit, this is the optimal solution.
Slack is the amount of a resource that is not used. For a less-than-or-equal constraint:
In Example III, the optimal solution is x1 = 30 and x2 = 40. Putting these values in
4x1 + 3x2 ≤ 240 gives 4(30) + 3(40) = 240; here LHS = RHS, so there is no slack.
Putting the same values in 2x1 + 1x2 ≤ 100 gives 2(30) + 1(40) = 100; again LHS = RHS, so
there is no slack (and no surplus).
Surplus is used with a greater-than-or-equal constraint to indicate the amount by which the
right-hand side of the constraint is exceeded.
For example, if the actual amount is 240 but the minimum requirement is only 160, you will
have the remaining value of 240 - 160 = 80 as surplus.
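The slack and surplus calculations above each reduce to one line of arithmetic; a minimal sketch in Python using the Example III data (the function names are our own):

```python
def slack(lhs, rhs):
    """Unused amount of a <= resource constraint (0 means the constraint is binding)."""
    return rhs - lhs

def surplus(lhs, rhs):
    """Amount by which a >= requirement's right-hand side is exceeded."""
    return lhs - rhs

x1, x2 = 30, 40  # optimal solution of Example III
print(slack(4 * x1 + 3 * x2, 240))   # drilling: 240 - 240 = 0, no slack
print(slack(2 * x1 + 1 * x2, 100))   # milling:  100 - 100 = 0, no slack
print(surplus(240, 160))             # 240 exceeds a minimum of 160 by 80
```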
If, for any two points selected in the region, the line segment joining them lies completely
within the region, then the region is a Convex Set,
i.e. a feasible region is always a convex set.
If two points can be selected in the region such that the line segment joining them does not lie
completely within the region, then it is a Non-Convex Set.
Check your progress 2
1. A feasible solution of LPP
A) Must satisfy all the constraints simultaneously
B) Need not satisfy all the constraints, only some of them
C) Must be a corner point of the feasible region
D) all of the above
2. The objective function for a L.P model is 3x1+2x2, if x1=20 and x2=30, what is the value of
the objective function?
A) 0
B) 50
C) 60
D) 120
3. The graphical method can only be used when there are _____ decision variables
1.4 Types of Constraints
1. Binding Constraints: if LHS = RHS when the optimal values of the decision variables are
substituted into a constraint, then that constraint is binding.
2. Non-Binding Constraints: if LHS ≠ RHS when the optimal values of the decision variables
are substituted into a constraint, then that constraint is non-binding.
3. Redundant Constraints: when a constraint, when plotted, does not form part of the boundary
marking the feasible region of the problem, it is said to be redundant. It does not affect the
optimal solution to the problem.
1.5 Special Cases
1.5.1 Multiple Optimal Solutions: more than one solution yields the same optimal value of profit
or cost, so the optimal solution is not unique.
Example IV
Maximize Z = 4x1 + 3x2
Subject to
4x1+ 3x2 ≤ 24
x1 ≤ 4.5
x2 ≤ 6
x1 ≥ 0 , x2 ≥ 0
Solution of Example IV
The corner points of feasible region are A, B, C and D. So the coordinates for the corner points
are
A (0, 6)
B (1.5, 6) (Solve the two equations 4x1+ 3x2 = 24 and x2 = 6 to get the coordinates)
C (4.5, 2) (Solve the two equations 4x1+ 3x2 = 24 and x1 = 4.5 to get the coordinates)
D (4.5, 0)
We know that Max Z = 4x1 + 3x2
At A (0, 6)
Z = 4(0) + 3(6) = 18
At B (1.5, 6)
Z = 4(1.5) + 3(6) = 24
At C (4.5, 2)
Z = 4(4.5) + 3(2) = 24
At D (4.5, 0)
Z = 4(4.5) + 3(0) = 18
Max Z = 24, which is achieved at both corner points B and C. In fact, it is achieved not only at B
and C but at every point between B and C. Hence the given problem has multiple optimal
solutions.
1.5.2 Unbounded Solution: a solution which increases or decreases the value of the objective
function of the LP problem indefinitely is called an unbounded solution. This generally happens
in a maximization problem in which all constraints have a “greater than or equal to” sign; the
feasible region then has no upper limit.
Example V
Maximize Z = 3x1 + 5x2
Subject to
2x1 + x2 ≥ 7
x1 + x2 ≥ 6
x1 + 3x2 ≥ 9
x1 ≥ 0, x2 ≥ 0
Solution of Example V
A (0, 7)
B (1, 5) (Solve the two equations 2x1+ x2 = 7 and x1+ x2 = 6 to get the coordinates)
C (4.5, 1.5) (Solve the two equations x1+ x2 = 6 and x1+ 3x2 = 9 to get the coordinates)
D (9, 0)
We know that Max Z = 3x1 + 5x2
At A (0, 7)
Z = 3(0) + 5(7) = 35
At B (1, 5)
Z = 3(1) + 5(5) = 28
At C (4.5, 1.5)
Z = 3(4.5) + 5(1.5) = 21
At D (9, 0)
Z = 3(9) + 5(0) = 27
The values of objective function at corner points are 35, 28, 21 and 27. But there exists infinite
number of points in the feasible region which is unbounded. The value of objective function will
be more than the value of these four corner points i.e. the maximum value of the objective
function occurs at a point at ∞. Hence the given problem has unbounded solution.
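Unboundedness can be demonstrated numerically by walking along a ray inside the feasible region and watching Z grow. The sketch below reconstructs the constraint set from the corner-point equations quoted in the solution (2x1 + x2 = 7, x1 + x2 = 6, x1 + 3x2 = 9, all of the ≥ type); treat that exact constraint set as an assumption:

```python
# Example V reconstructed: Max Z = 3*x1 + 5*x2 with
# 2*x1 + x2 >= 7, x1 + x2 >= 6, x1 + 3*x2 >= 9, x1, x2 >= 0.
def feasible(x1, x2):
    return (x1 >= 0 and x2 >= 0
            and 2 * x1 + x2 >= 7
            and x1 + x2 >= 6
            and x1 + 3 * x2 >= 9)

# Along the ray (t, 7), every point stays feasible for t >= 0 while
# Z = 3*t + 35 grows without bound: exactly what "unbounded" means.
for t in (0, 10, 100, 1000):
    assert feasible(t, 7)
    print(t, 3 * t + 5 * 7)
```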
1.5.3 Infeasibility: a set of values of the decision variables which does not satisfy all the
constraints and non-negativity conditions of an LP problem simultaneously is said to constitute
an infeasible solution to that linear programming problem. Commonly, this occurs when it is not
possible to find a common region that satisfies all constraints simultaneously.
Example VI
Subject to
x1+ x2 ≤ 1
x1+ x2 ≥ 3
x1 ≥ 0 , x2 ≥ 0
Solution of Example VI
There is no common feasible region generated by two constraints together i.e. we cannot identify
even a single point satisfying the constraints. Hence there is no optimal solution.
4. Infinitely many feasible solutions exist, but none of them can be termed an optimal solution;
this is known as the ______________ special case of LPP.
5. If two or more optimal solutions have the same value of maximum profit or minimum cost,
the case is termed ________________________
1.6 Applications of Linear Programming
Marketing Research / Consumer Research: to minimize the cost of research subject to the given
constraints
Production Mix: Number of units of production for one or more different products for
Maximizing the profit or Minimizing the cost
Ingredient Mix: Ingredient Mixing proportion decision for making one or more products
Financial Portfolio Selection: Maximizing return on investment subject to a set of risk factors
1.7 Let Us Sum Up
In this unit, we started with a general introduction to the linear programming problem, followed
by identification of the decision variables, which are economic or physical quantities whose
values are of major interest to the management. The problem must have a well-defined objective
function expressed in terms of the decision variables. The objective function must be maximized
when it expresses profit or contribution; in case the objective function indicates a cost, it must
be minimized. When a problem of management is expressed as a mathematical function using
decision variables with an appropriate objective function and constraints, the problem has been
formulated. A linear programming problem with only two decision variables can be solved
graphically. Any non-negative solution which satisfies all the constraints is known as a feasible
solution of the problem. The common region which satisfies all the constraints is known as the
feasible region. The values of the decision variables which maximize or minimize the objective
function are located at an extreme point of the convex set (feasible region) formed by the
feasible solutions. Among all the feasible solutions, there can be one or more optimal solutions.
Sometimes the problem may be infeasible, indicating that no feasible solution of the problem
exists. Sometimes there is no boundary to form the convex set, so there are infinitely many
feasible solutions but none of them can be termed an optimal solution (unboundedness). The
different applications of linear programming were also discussed in this unit.
Subject to,
5x1 + 2x2 ≥ 14 (Good Iron Constraint)
3x1 + 2x2 ≥ 9 (Mediocre Iron Constraint)
4x1 + 10x2 ≥ 22 (Bad Iron Constraint)
x1, x2 ≥ 0 (Non-negativity constraint)
2. False
3. A
1. A
2. D
3. Two
4. Feasible
Answers to check your progress 3
1. Binding
2. D
3. True
4. Unboundedness
1.9 Glossary
Decision Variables: are economic or physical quantities whose numerical values indicate the
solution of the linear programming problem.
The Objective Function: of a linear programming problem is a linear function of the decision
variables expressing the objective of the decision maker.
Constraints: of a linear programming problem are linear equations or inequalities arising out of
practical limitations.
A Feasible Solution: of a linear programming problem is a solution which satisfies all the
constraints including the non-negativity constraints.
A Redundant Constraint: is a constraint which does not affect the feasible region.
A Convex Set: is a collection of points such that for any two points on the set, the line joining
the points belongs to the set.
Non-Convex Set: a region in which two points can be selected such that the line segment
joining them does not lie completely within the region.
Multiple Solutions: of a linear programming problem are solutions each of which maximize or
minimize the objective function.
1. A retired person wants to invest up to an amount of Rs. 30,000 in fixed income securities. His
broker recommends investing in two bonds: Bond A yielding 7% and Bond B yielding 10%.
After some consideration, he decides to invest at most Rs. 12,000 in Bond A and at least
Rs. 6,000 in Bond B. He also wants the amount invested in Bond A to be at least equal to the
amount invested in Bond B. What should the broker recommend if the investor wants to
maximize his return on investment? Solve graphically.
2. A firm manufactures two products, TVs and DVD players, which must be processed through
two processes, Assembly and Finishing. Assembly has 90 hours available and Finishing has 82
hours available. One TV set requires 5 hours in assembly and 3 hours in finishing, while one
DVD player requires 6 hours in assembly and 4 hours in finishing. If the profit is Rs. 900 per TV
and Rs. 600 per DVD player, find the best combination of TVs and DVD players to realize the
maximum profit.
3. A rubber company is engaged in producing three different types of tyres A, B, and C. The
company has two production plants to produce these. In a normal eight hour working day, plant I
produces 100, 200 and 200 tyres of types A, B and C respectively. Plant II produces 120, 120,
and 400 tyres of type A,B, and C respectively. The monthly demand of A, B, and C is 5000,
6000 and 14000 units respectively. The daily costs of operation of plants I and II are Rs. 5000
and Rs. 7000 respectively. Find the minimum number of days of operation per month at the two
plants so as to minimize the total cost while meeting the demand, using the graphical method.
5. A firm uses lathes, milling and grinding machines to produce two parts. Following table
represents the machining times required for each part, available machine time on different
machines and the profit values:
Visit a manufacturing company and collect data on any two types of products they produce
which use any number of common resources: cost or profit per unit of product, minimum or
maximum availability of resources, and the number of hours, kgs, etc. required to produce one
unit of each product. Then prepare a table of the information, formulate it as an LPP and solve it
graphically to identify the optimal cost or profit.
1) Solve the problem graphically to determine the optimum product mix of capacitors and
resistors for the next month. Also determine the corresponding optimum achievable profit from
sales of resistors and capacitors. Which facilities are fully utilized and which resources are left
unused at the optimal stage?
2) Are there alternate (multiple) optimal solutions available to Mr. Pavan Kumar? If so, suggest
another solution.
2.1 Introduction
2.5 Glossary
2.6 Assignment
2.7 Activities
2.1 Introduction
While the graphical method of solving a linear programming problem that you learned in the
first unit of this block is a vital help in understanding the basic structure of a problem, the
method has limited application in industrial problems, where the number of variables is usually
substantially large. A more useful method, known as the Simplex Method, is suitable for solving
linear programming problems with a larger number of variables (two or more). Through an
iterative process, the method progressively approaches and finally reaches the maximum or
minimum value of the objective function. The method also helps the decision maker to identify
redundant constraints, an unbounded solution, multiple solutions and an infeasible problem.
Every linear programming problem has a dual problem associated with it. The solution of the
dual is readily obtained from the solution of the original problem if the simplex method is used.
The variables of the dual problem are known as dual variables, or shadow prices of the various
resources. The solution of the dual problem can be used by the decision maker for augmenting
the resources.
The simplex method was developed by G. Dantzig in 1947. It provides an algorithm based on
the fundamental theorem of linear programming. The simplex algorithm is an iterative procedure
for solving LP problems in a finite number of steps. It consists of the following:
• Having a trial basic feasible solution to constraint-equations
• Testing whether it is an optimal solution
• Improving the first trial solution by a set of rules and repeating the process till an optimal
solution is obtained
2.2.1 Algorithm of simplex method
To solve a linear programming problem in standard form, use the following steps.
1. Convert each inequality in the set of constraints to an equation by adding slack variables.
2. Create the initial simplex tableau, calculate Z and the Δj (index-row) values, and test the
basic feasible solution for optimality.
3. In this step the basic feasible solution is improved: the vector entering the basis matrix and
the vector to be removed from the basis matrix are determined. Locate the most negative entry
in the bottom row; the column for this entry is called the entering column. (If ties occur, any of
the tied entries can be used to determine the entering column.) Now find the minimum ratio for
the column of the incoming variable and select the row with the minimum ratio as the outgoing
variable. (Negative ratios are never considered.) The cell at the intersection of the incoming
variable column and the outgoing variable row is selected.
4. Mark the key element at the intersection of the incoming and outgoing variables. Divide all
the elements of that row by the key element. Then subtract appropriate multiples of this new
row from the remaining rows, so as to obtain zeroes in the remaining positions of the respective
column.
5. Repeat steps 2 to 4 until no negative entry remains in the bottom row.
6. If all entries in the bottom row are zero or positive, this is the final tableau.
Consider the following example of a linear programming problem to understand the basic
principles of the simplex method. In the simplex method as presented here, the objective
function is always to be maximized, not minimized.
Subject to,
-x1 + x2 ≤ 11
x1 + x2 ≤ 27
2x1 + 5x2≤ 90
x1, x2 ≥ 0
Since the left-hand side of each inequality is less than or equal to the right-hand side, there must
exist nonnegative numbers s1, s2 and s3 that can be added to the left side of each inequality to
produce the following system of linear equations. The numbers s1, s2 and s3 are called slack
variables because they take up the “slack” in each inequality. Remember that slack variables are
introduced only for constraints, not for the objective function.
Subject to,
-x1 + x2 + s1 = 11
x1 + x2 + s2 = 27
2x1 + 5x2 + s3 = 90
A basic solution of a linear programming problem in standard form is a solution of the constraint
equations in which at most m variables are nonzero. The variables that are nonzero are called
basic variables. A basic solution for which all variables are nonnegative is called a basic feasible
solution.
Procedure to test the basic feasible solution for optimality by the rules given:
Rule 1: If all Δj ≥ 0, the solution under test is optimal. An alternate optimal solution exists if
any non-basic Δj is also zero.
Rule 2: If at least one Δj is negative, the solution is not optimal; proceed to improve the solution
in the next step.
Example I
Maximize Z = 3x1 + 2x2
Subject to
x1 + x2 ≤ 4
x1 – x2 ≤ 2
and x1 ≥ 0, x2 ≥ 0
Solution of Example I
1. Convert each inequality in the set of constraints to an equation by adding slack variables.
Subject to
x1 + x2+ s1= 4
x1 – x2 + s2= 2
x1 ≥ 0, x2 ≥ 0, s1 ≥ 0, s2 ≥ 0
2. Create the initial simplex tableau, calculate Z and the Δj values, and test the basic feasible
solution for optimality.
The simplex method is carried out by performing elementary row operations on a matrix that we
call the simplex tableau. This tableau consists of the matrix of constraint coefficients together
with the coefficients of the objective function written in a specific form. In the bottom row of
the initial tableau, the objective-function coefficients appear with negative signs.
Calculate Z and the Δj values and test the basic feasible solution for optimality by the rules
given:
Z = CB XB = (0 * 4 + 0 * 2) = 0
In the calculations below, Cj denotes the objective-function coefficients of x1, x2, s1, s2.
Δj for x1 = CB X1 - Cj = (0 * 1 + 0 * 1) - 3 = -3
Δj for x2 = CB X2 - Cj = (0 * 1 + 0 * (-1)) - 2 = -2
Δj for s1 = CB X3 - Cj = (0 * 1 + 0 * 0) - 0 = 0
Δj for s2 = CB X4 - Cj = (0 * 0 + 0 * 1) - 0 = 0
In this problem it is observed that there are negative values, -3 and -2; hence we proceed to
improve this solution.
3. In this step the basic feasible solution is improved: the vector entering the basis matrix and
the vector to be removed from the basis matrix are determined. Locate the most negative entry
in the bottom row; the column for this entry is called the entering column. (If ties occur, any of
the tied entries can be used to determine the entering column.) Now find the minimum ratio for
the column of the incoming variable and select the row with the minimum ratio as the outgoing
variable. (Negative ratios are never considered.) The cell at the intersection of the incoming
variable column and the outgoing variable row is selected.
4. Mark the key element at the intersection of the incoming and outgoing variables. Divide all
the elements of that row by the key element. Then subtract appropriate multiples of this new
row from the remaining rows, so as to obtain zeroes in the remaining positions of the column Xk.
Here the key element is 1, so divide the second row by 1. The related calculation is shown
below.
Use R1 = R1 - R2 for the first-row calculation: 1 - 1 = 0, 1 - (-1) = 2, 1 - 0 = 1, 0 - 1 = -1,
4 - 2 = 2 respectively.
The values 6, 0, -5, 0, 3 are calculated as explained in step 2. One value, -5, is still negative, so
this is not yet an optimal solution.
Z= 11 0 0 5/2 1/2
6. If all entries in the bottom row are zero or positive, this is the final tableau. The variables in
the basis are the basic variables; the variables with value zero are known as non-basic variables.
As all the values are nonnegative, this is an optimal solution. The XB values are the solution, so
the answer is x1 = 3 and x2 = 1; thus the maximum profit is Z = 3x1 + 2x2 = 11.
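Steps 1 to 6 can be sketched end to end as a small tableau simplex in Python. This is a minimal sketch for maximization problems with ≤ constraints only (the function name and internal structure are our own); it reproduces the result of Example I:

```python
def simplex_max(c, A, b):
    """Tableau simplex: maximize c.x subject to A x <= b, x >= 0,
    assuming all entries of b are nonnegative (slack basis is feasible)."""
    m, n = len(A), len(c)
    # Constraint rows with slack columns and RHS; last row holds Zj - Cj.
    tab = [list(map(float, A[i])) + [float(j == i) for j in range(m)] + [float(b[i])]
           for i in range(m)]
    tab.append([-float(ci) for ci in c] + [0.0] * (m + 1))
    basis = list(range(n, n + m))              # slack variables start in the basis
    while True:
        col = min(range(n + m), key=lambda j: tab[-1][j])
        if tab[-1][col] >= -1e-9:              # no negative Zj - Cj: optimal
            break
        ratios = [(tab[i][-1] / tab[i][col], i) for i in range(m) if tab[i][col] > 1e-9]
        if not ratios:
            raise ValueError("problem is unbounded")
        _, row = min(ratios)                   # minimum-ratio rule picks the pivot row
        basis[row] = col
        piv = tab[row][col]
        tab[row] = [v / piv for v in tab[row]]  # normalize the pivot row
        for r in range(m + 1):
            if r != row and tab[r][col] != 0.0:
                f = tab[r][col]
                tab[r] = [v - f * p for v, p in zip(tab[r], tab[row])]
    x = [0.0] * n
    for i, bv in enumerate(basis):
        if bv < n:
            x[bv] = tab[i][-1]
    return tab[-1][-1], x

# Example I above: Maximize Z = 3x1 + 2x2 with x1 + x2 <= 4, x1 - x2 <= 2.
z, x = simplex_max([3, 2], [[1, 1], [1, -1]], [4, 2])
print(z, x)  # 11.0 [3.0, 1.0]
```

The intermediate bottom rows it produces (0, -5, 0, 3 with Z = 6, then all nonnegative with Z = 11) match the hand calculation above.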
Example II
Solution of Example II
Maximize Z = 80x1 + 55x2 + 0s1 + 0s2
Subject to
4x1 + 2x2+ s1= 40
2x1 + 4x2 + s2= 32
x1 ≥ 0, x2 ≥ 0, s1 ≥ 0, s2 ≥ 0
Cj →                        80     55      0      0
Basic     CB    XB          x1     x2     s1     s2    Min ratio XB/Xk
s1         0    40           4      2      1      0    40/4 = 10 → outgoing
s2         0    32           2      4      0      1    32/2 = 16
Z = 0                      -80    -55      0      0
                             ↑ incoming
x1        80    10           1    1/2    1/4      0    10/(1/2) = 20
s2         0    12           0      3   -1/2      1    12/3 = 4 → outgoing
Z = 800                      0    -15     20      0
                                    ↑ incoming
x1        80     8           1      0    1/3   -1/6
x2        55     4           0      1   -1/6    1/3
Z = 860                      0      0   35/2      5
Answer is X1= 8 and x2 = 4, so Z = 860
Check your progress 1
1. In the simplex method, a tableau is optimal only if all the values in the bottom (index) row at the end of the solution are:
(a) zero or negative.
(b) zero.
(c)negative and nonzero.
(d) positive and zero.
2. Linear programming problem involving more than two variables can be solved by:
Subject to,
x1 + 2x2 + 2x3 ≤ 8
3x1 + 2x2 + 6x3 ≤ 12
2x1 + 3x2 + 4x3 ≤ 12
x1, x2, x3 ≥ 0
The simplex method is the appropriate method for solving a linear programming problem with
more than two decision variables. For less than or equal to type constraints slack variables are
introduced to make inequalities equations. A type of solution known as a basic feasible solution
is important for simplex computation. A basic feasible solution of a system with m equations and
n variables has m non negative variables known as basic variables and n-m variables with value
zero, known as non-basic variables. A basic feasible solution can always be found with the help
of the slack variables. The objective function is maximized at one of the basic feasible solutions.
Starting with the initial basic feasible solution obtained from the slack variables the simplex
method improves the value of the objective function step by step by bringing in a new basic
variable and making one of the present basic variables non basic. The selection of the new basic
variable and the omission of a current basic variable are performed following certain rules so that
the revised basic feasible solution improves the value of the objective function. The iterative
procedure stops when it is no longer possible to obtain a better value of the objective function
than the present one. The existing basic feasible solution is the optimum solution of the problem
which maximizes objective function.
2.4 Answers for Check your Progress
1. d
2. a
3. Z = 12 where x1 = 4, x2 = 3
2.5 Glossary
2.6 Assignment
Subject to
x1 + 2x2 + x3 ≤ 430
3x1 + 2x3 ≤ 460
x1 + 4x2 ≤ 420
x1, x2, x3 ≥ 0
2. A manufacturer of bags makes three types of bags, P, Q and R, which are processed on three
machines M1, M2 and M3. Bag P requires 2 hours on machine M1, 3 hours on machine M2 and
2 hours on machine M3. Bag Q requires 3 hours on machine M1, 2 hours on machine M2 and 2
hours on machine M3, and bag R requires 5 hours on machine M2 and 4 hours on machine M3.
There are 8 hours of time per day available on machine M1, 10 hours per day on machine M2
and 15 hours per day on machine M3. The profit gained from bag P is Rs 3.00 per unit, from bag
Q Rs 5.00 per unit and from bag R Rs 4.00 per unit. What should be the daily production of each
type of bag so that the products yield the maximum profit?
3. Use the simplex method to solve the following LPP:
Subject to
2x1 + x2 + x3 ≤ 2
3x1 + 4x2 + 2x3 ≤ 8
x1, x2, x3 ≥ 0
2.7 Activities
Subject to
2x1 + 3x2 ≤ 8
2x2 + 5x3 ≤ 10
3x1 + 2x2 +4x3 ≤15
x1, x2, x3 ≥ 0
2. The products A, B and C are produced in three machine centers X, Y and Z. Each product
involves operation in each of the machine centers. The time required for each operation per unit
of each product is given below. 100, 77 and 80 hours are available at machine centers X, Y and
Z respectively. The profit per unit of A, B and C is Rs. 12, Rs. 3 and Rs. 1 respectively. Find a
suitable product mix so as to maximize the profit.
A manufacturer of three products tries to follow a policy of producing those which contribute
most to fixed cost and profit. However, there is also a policy of recognizing certain minimum
sales requirements; currently, these are for products x1, x2 and x3. There are three producing
departments. The production times in hours per unit in each department and the total times
available each week in each department are given in the table. The contribution per unit of
products x1, x2 and x3 is Rs. 10.50, Rs. 9.00 and Rs. 8.00 respectively. Solve by the simplex
method.
3.1 Introduction
3.1.1 Basic Structure of Transportation
3.5 Let Us Sum Up
3.7 Glossary
3.8 Assignment
3.9 Activities
3.10 Case Study
3.1 Introduction
The transportation problem deals with the distribution of goods from several points of supply
(sources/origins) to a number of points of demand (destinations). Usually we are given the
capacity of goods at each source and the requirements at each destination. Basically, the
objective is to minimize total transportation and production costs; sometimes we deal with
maximization of profit as well. This is an iterative procedure in which a solution to a
transportation problem is found and evaluated using a special procedure to determine whether
the solution is optimal. When the solution is optimal, the process stops; if not, a new solution is
generated. The basic structure of a transportation problem is discussed with the help of the
following example.
Source     P     Q     R     S   Supply
A         40    45    35    36      300
B         48    50    52    46      200
C         43    44    55    50      400
D         44    50    40    30      400
Demand   250   300   350   400     1300
Consider a manufacturer who operates four factories (Sources) and dispatches his products to
four different retail shops (Destinations). The Table above indicates the capacities (Supply) of
the four factories, the quantity of products required (Demand) at the various retail shops and the
cost of shipping one unit of the product from each of four factories to each of the four retail
shops.
The table, usually referred to as the Transportation Table, provides the basic data of the
transportation problem. The capacities of factories A, B, C, and D are 300, 200, 400, and 400
respectively. The requirements at retail shops P, Q, R, and S are 250, 300, 350, and 400
respectively. The values inside the intersecting cells (e.g. cell AP, the per-unit transportation
cost from Source A to Destination P) are known as unit transportation costs. So, the cost of
transporting one unit from Factory A to retail shop P is Rs. 40, from Factory A to retail shop Q
Rs. 45, and so on.
Example I
Solution
Here supply = demand = 100 so go ahead with step 2. First, start with the cell on intersection of
A and P. The row total corresponding to this is 30 and column total at destination P is 20. So,
allocate 20 which is minimum out of two at AP and remaining units are 10 at source A. At the
destination P, requirement has been satisfied so eliminate column P, move horizontally to the cell
AQ. With the supply available at source A being 10 and the demand at Q being 20, allocate the
minimum of the two, which is 10, at AQ. As no supply is now available at source A, move
directly down to cell BQ, where 10 units of demand are left to satisfy.
Allocate 10 units to cell BQ and move horizontally again, at BR now remaining supply being 30
and demand being 25 so allocate 25 at BR. Now again move horizontally at BS, with remaining
units of 5 at source B and with demand of 35, allocate 5 units to cell BS. Again, move
horizontally at CP, CQ and CR where no units are left to allocate. So, by default last 30 units will
be allocated at cell CS.
This is the simplest method to use, but because it starts from the north-west corner without looking at the transportation costs, high-cost cells may end up carrying allocations.
Initial Feasible Solution: NWC Method
Calculate Total Cost = (15*20) + (18*10) + (19*10) + (20*25) + (14*5) + (17*30) = 1750 Rs.
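As a quick sanity check, the rule can be sketched in a few lines of Python. The cost matrix, supplies and demands below are reconstructed from the figures quoted in the walkthrough (the original Example I table is not reproduced here), so treat them as an assumption:

```python
# Example I data as reconstructed from the worked solution (assumed):
# rows are sources A, B, C; columns are destinations P, Q, R, S.
cost = [[15, 18, 22, 16],
        [15, 19, 20, 14],
        [13, 16, 23, 17]]
supply = [30, 40, 30]
demand = [20, 20, 25, 35]

def north_west_corner(supply, demand):
    """Allocate by the North-West Corner rule; returns {(row, col): units}."""
    s, d = supply[:], demand[:]          # work on copies
    i = j = 0
    alloc = {}
    while i < len(s) and j < len(d):
        q = min(s[i], d[j])              # ship as much as the cell allows
        alloc[(i, j)] = q
        s[i] -= q
        d[j] -= q
        if s[i] == 0:
            i += 1                       # row exhausted: move down
        else:
            j += 1                       # column satisfied: move right
    return alloc

alloc = north_west_corner(supply, demand)
total = sum(cost[i][j] * q for (i, j), q in alloc.items())
print(alloc)
print(total)                             # matches the Rs. 1750 computed above
```

Note that the rule never consults `cost`; the matrix is used only afterwards to price the allocation, which is exactly why the method can land on expensive cells.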
1. First check whether supply and demand are equal; if so, go to step 2. Otherwise, add a dummy row if supply is less, or a dummy column if demand is less.
2. Choose the cell with the minimum cost.
3. Consider the supply at the source and the demand at the destination corresponding to that cell, and allocate the lower of the two to that cell.
4. Delete the row or column, whichever is satisfied by this allocation.
5. If a row is deleted, revise the corresponding column total by subtracting the allocated quantity; if a column is deleted, revise the row total likewise.
6. Again choose the cell with the least cost among the remaining cells, make the assignment, and adjust the row and column totals.
7. Continue until all the units are assigned.
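A minimal sketch of these steps in Python is given below, using the Example I data reconstructed from the surrounding walkthroughs (an assumption):

```python
# Example I data, reconstructed from the surrounding walkthroughs (assumed).
cost = [[15, 18, 22, 16],
        [15, 19, 20, 14],
        [13, 16, 23, 17]]
supply = [30, 40, 30]
demand = [20, 20, 25, 35]

def least_cost_method(cost, supply, demand):
    """Repeatedly allocate to the cheapest remaining cell (steps 2-7)."""
    s, d = supply[:], demand[:]
    live = {(i, j) for i in range(len(s)) for j in range(len(d))}
    alloc = {}
    while live:
        i, j = min(live, key=lambda c: cost[c[0]][c[1]])   # step 2
        q = min(s[i], d[j])                                # step 3
        alloc[(i, j)] = q
        s[i] -= q
        d[j] -= q
        if s[i] == 0:                                      # steps 4-5
            live -= {(i, k) for k in range(len(d))}
        if d[j] == 0:
            live -= {(k, j) for k in range(len(s))}
    return alloc

alloc = least_cost_method(cost, supply, demand)
total = sum(cost[i][j] * q for (i, j), q in alloc.items())
print(alloc, total)
```

For brevity this sketch breaks cost ties by position; the tie-breaking rule given in the text (prefer the cell allowing the larger allocation) is not implemented.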
Solution
Here supply = demand = 100, so go ahead with step 2. First, select the least cost in the whole matrix, which is 13 at cell CP. At CP the supply is 30 and the demand is 20, so allocate 20 units at CP; cut column P, as its demand has been satisfied, leaving 10 units at source C. Next, the minimum among the remaining costs is 14 at cell BS. With supply 40 and demand 35, allocate 35 units at BS and cut column S, leaving 5 units at source B. The next minimum is 16 at cell CQ; allocate the 10 units remaining at source C and cut row C, as its supply has been dispatched fully, leaving a demand of 10 at Q. The next minimum is 18 at AQ; allocate 10 units, as the remaining demand at Q is only 10, and cut column Q. Of the two remaining cells, the minimum is 20 at BR, so allocate the remaining 5 units at BR and, finally, the last 20 units at AR.
Initial Feasible Solution: LCM Method
Source P Q R S Supply
A 15 18[10] 22[20] 16 30
B 15 19 20[5] 14[35] 40
C 13[20] 16[10] 23 17 30
Demand 20 20 25 35 100
Calculate Total Cost = (18*10) + (22*20) + (20*5) + (14*35) + (13*20) + (16*10) = 1630 Rs.
1. For each row/column of the table, find the difference between the two lowest costs (the opportunity cost, or penalty).
2. Find the greatest opportunity cost/penalty.
3. Assign as many units as possible to the lowest-cost square in the row/column with the greatest opportunity cost.
4. Eliminate the row or column which has been completely satisfied.
5. Begin again, omitting the eliminated rows/columns. Because the process is repeated a number of times, it is known as an iterative process.
Solution
The highest penalty of 3 occurs at row C; the minimum cost in row C is 13, so allocate 20 units at the corresponding cell CP and eliminate column P, as its demand has been satisfied; 10 units remain at source C. Repeat steps 1 and 2 in the second iteration (II) with only the remaining values of columns Q, R and S. The highest penalty is now at row B, where the minimum cost is 14, so allocate 35 units at cell BS and eliminate column S. Repeat steps 1 and 2 in the third iteration (III) with only the remaining values of columns Q and R. Now the highest penalty is at row C, with minimum cost 16, so allocate 10 units at cell CQ and eliminate row C, as its supply has been fully delivered. Differences can still be calculated between the remaining values, so repeat steps 1 and 2 in the fourth iteration. The highest penalty is at row A, with minimum cost 18, so allocate the remaining 10 units of demand at cell AQ and eliminate column Q. Now only one column is left, so no difference can be calculated and no further iteration is possible; allocate the remaining supply and demand accordingly.
Calculate Total Cost = (18*10) + (22*20) + (20*5) + (14*35) + (13*20) + (16*10) = 1630 Rs.
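The penalty-driven iterations can be sketched as below, again using the Example I data reconstructed from the text (an assumption). When only one cost is left in a row or column, its penalty is taken as that lone cost, which is one common convention:

```python
# Example I data, reconstructed from the surrounding walkthroughs (assumed).
cost = [[15, 18, 22, 16],
        [15, 19, 20, 14],
        [13, 16, 23, 17]]
supply = [30, 40, 30]
demand = [20, 20, 25, 35]

def penalty(costs):
    """Difference between the two lowest costs (the lone cost if only one)."""
    c = sorted(costs)
    return c[1] - c[0] if len(c) > 1 else c[0]

def vogel(cost, supply, demand):
    s, d = supply[:], demand[:]
    rows, cols = set(range(len(s))), set(range(len(d)))
    alloc = {}
    while rows and cols:
        # step 1: penalties for every live row and column
        cand = [(penalty([cost[i][j] for j in cols]), 'row', i) for i in rows]
        cand += [(penalty([cost[i][j] for i in rows]), 'col', j) for j in cols]
        _, kind, k = max(cand)                              # step 2
        if kind == 'row':
            i, j = k, min(cols, key=lambda j: cost[k][j])   # step 3
        else:
            i, j = min(rows, key=lambda i: cost[i][k]), k
        q = min(s[i], d[j])
        alloc[(i, j)] = q
        s[i] -= q
        d[j] -= q
        if s[i] == 0:                                       # step 4
            rows.discard(i)
        if d[j] == 0:
            cols.discard(j)
    return alloc

alloc = vogel(cost, supply, demand)
total = sum(cost[i][j] * q for (i, j), q in alloc.items())
print(alloc, total)
```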
Note: If there is a tie between two minimum costs, select the one where the maximum allocation can be made. If there is a tie in both the least cost and the maximum allocation, select either of the two.
Check your progress 1
1. The initial solution of a transportation problem can be obtained by using any of the three known methods. However, the only condition is that
(a) the solution be optimal (b) the rim conditions are satisfied
(c) the solution not be degenerate (d) all of the above
2. One disadvantage of using the North-West Corner Rule to find an initial solution to the transportation problem is that
(a) it is complicated to use.
(b) it leads to a degenerate initial solution.
(c) it does not take into account the cost of transportation.
(d) all of the above.
4. The method of finding an initial solution based upon opportunity costs is called __________
5. Identify the initial basic feasible solution method by which the following solution was developed, and find the total cost of transportation.
FROM \ TO P Q R S Supply
A 12[180] 10[150] 12[170] 13 500
B 7 11 8[180] 14[120] 300
C 6 16 11 7[200] 200
Demand 180 150 350 320 1000
1) Find a basic feasible solution of the transportation problem using one of the three methods described in the previous section. Check the condition m + n - 1 = number of occupied cells (where m = number of rows and n = number of columns) before applying the MODI method. This condition must be checked at every step of the method.
2) Introduce dual variables corresponding to the row constraints and the column constraints. If there are m origins and n destinations, there will be m + n dual variables. The dual variables corresponding to the row constraints are denoted by ui (i = 1, 2, …, m), while those corresponding to the column constraints are denoted by vj (j = 1, 2, …, n).
3) The values of the dual variables are determined from the following equations, which can be written only for the occupied cells:
ui + vj = cij
One of the dual variables can be chosen arbitrarily. Note also that, since the primal constraints are equations, the dual variables are unrestricted in sign. Any positive or negative number could be selected, but it is customary to assign zero. The best place to assign zero is the row or column containing the maximum number of occupied cells.
4) Now find the opportunity cost of each unoccupied cell (a cell where no allocation has been made) with the help of the following formula:
Δij = cij – (ui + vj)
If any value is negative, the transportation cost can be reduced by that many rupees per unit shipped through that cell.
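Steps 2 to 4 can be sketched in Python for the Example II data. The full cost matrix below is reconstructed from the figures quoted in the worked solution that follows, so it is an assumption; the occupied cells are those of the least-cost starting solution:

```python
# Example II data as reconstructed from the worked solution (assumed):
# rows P1-P3, columns D1-D4.
cost = [[20, 30, 50, 17],
        [70, 35, 40, 60],
        [40, 12, 60, 25]]
# occupied cells of the least-cost starting solution and their allocations
occupied = {(0, 0): 5, (0, 3): 2, (1, 2): 7, (1, 3): 3, (2, 1): 8, (2, 3): 10}
m, n = len(cost), len(cost[0])

u, v = [None] * m, [None] * n
u[0] = 0                                  # one dual chosen arbitrarily (step 3)
while None in u or None in v:             # propagate ui + vj = cij over
    for (i, j) in occupied:               # the occupied cells only
        if u[i] is not None and v[j] is None:
            v[j] = cost[i][j] - u[i]
        elif v[j] is not None and u[i] is None:
            u[i] = cost[i][j] - v[j]

# step 4: opportunity costs of the unoccupied cells
delta = {(i, j): cost[i][j] - (u[i] + v[j])
         for i in range(m) for j in range(n) if (i, j) not in occupied}
print(u, v)
print(delta)
```

A negative entry in `delta` signals the per-unit saving available by re-routing units along a closed loop through that cell.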
Let us consider the following transportation problem, given in Example II, with a basic feasible solution computed by the least cost method.
Example II
Total Cost TC = (20*5) + (17*2) + (40*7) + (60*3) + (12*8) + (25*10) = 940 Rs.
Steps 2 and 3. Putting u1 = 0 and considering only the occupied cells, the dual variables are calculated as follows:
u1 + v1 = 20 gives v1 = 20; u1 + v4 = 17 gives v4 = 17; u2 + v4 = 60 gives u2 = 43;
u3 + v4 = 25 gives u3 = 8; u3 + v2 = 12 gives v2 = 4; u2 + v3 = 40 gives v3 = -3.
P1D2 30 – (0 + 4) = 26
P1D3 50 – (0 + (-3)) = 53
P2D1 70 – (43 + 20) = 7
P2D2 35 – (43 + 4) = -12
P3D1 40 – (8 + 20) = 12
P3D3 60 – ( 8 + (-3)) = 55
The negative value at cell P2D2 shows that a cost reduction of Rs. 12 per unit is possible.
A closed loop (shown in the non-optimal solution with signs) always starts at the selected unoccupied cell with a plus sign. Except for the beginning cell, all the other cells on the loop are occupied. The signs alternate: plus, then minus, then plus, and so on, until the loop ends back at the starting cell. The closed loop from P2D2 is shown in the above non-optimal solution. To shift units, consider the cells with negative signs and select the minimum allocation among them. Here the negative-sign cells hold allocations of 3 and 8 units, so select 3 units to shift. Shift the 3 units according to the signs: add 3 wherever there is a plus sign and subtract 3 wherever there is a minus sign. The new solution will then be as below:
Now, again check opportunity costs of each unoccupied cell as explained above, if all
opportunity costs are zero or greater than zero then, it is an optimal solution.
P1D2 30 – (0 +4) = 26
P1D3 50 – (0 + 9) = 41
P2D1 70 – (31 + 20) = 19
P2D4 60 – (31 + 17) = 12
P3D1 40 – (8 + 20) = 12
P3D3 60 – ( 8 + 9) = 43
Total cost TC = (20*5) + (17*2) + (35*3) + (40*7) + (12*5) + (25*13) = 904 Rs.
Check your progress 2
1. In a transportation problem, the total demand of the destinations must be identical to the total capacity of the sources, otherwise it cannot be solved. State true or false.
2. In Vogel's approximation method, the differences between the smallest and second smallest costs in each row and column are called ______.
3.4.1 Unbalanced Transportation Problem
When supply and demand are not equal, the problem is known as an unbalanced transportation problem. To make it balanced, add a dummy row with zero cost in each cell if supply is less, or a dummy column with zero cost in each cell if demand is less. The following example will make the procedure clear:
Example III
A B C Supply
X 9 11 10 40
Y 10 8 12 60
Z 12 7 8 50
Demand 50 40 30 120 / 150
Solution
Here supply is 150 and demand is 120, so demand is less by 30 units, as demand is less, we will
add dummy column with D destination with zero transportation costs as actually it does not
contribute in total transportation cost. If supply is less, add dummy row with zero transportation
cost. Solution for the same is shown in table below:
A B C D Supply
X 9 11 10 0 40
Y 10 8 12 0 60
Z 12 7 8 0 50
Demand 50 40 30 30 150
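The balancing step can be sketched as a small helper; the figures used are those of Example III above:

```python
def balance(cost, supply, demand):
    """Pad with a zero-cost dummy row or column until supply equals demand."""
    cost = [row[:] for row in cost]       # copy so the input stays intact
    supply, demand = supply[:], demand[:]
    gap = sum(supply) - sum(demand)
    if gap > 0:                           # demand short: add dummy destination
        for row in cost:
            row.append(0)
        demand.append(gap)
    elif gap < 0:                         # supply short: add dummy source
        cost.append([0] * len(demand))
        supply.append(-gap)
    return cost, supply, demand

# Example III: supply 150 vs demand 120, so a dummy destination D absorbs 30.
cost = [[9, 11, 10], [10, 8, 12], [12, 7, 8]]
c, s, d = balance(cost, [40, 60, 50], [50, 40, 30])
print(c, s, d)
```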
3.4.2 Multiple Optimal Solutions
If the opportunity costs of all unoccupied cells are positive, the solution is optimal. However, when one of the opportunity costs is zero, another transportation schedule is possible without increasing or decreasing the total transportation cost. Units can then be shifted into the unoccupied cell with zero opportunity cost according to the closed-loop rule, giving an alternative transportation schedule with the same total cost.
3.4.3 Degeneracy
A basic feasible solution of a transportation problem has m + n - 1 basic variables, which means that the number of occupied cells in such a solution is one less than the number of rows plus the number of columns. It may sometimes happen that the number of occupied cells is smaller than m + n - 1. Such a solution is called a degenerate solution.
Degeneracy in a transportation problem can arise in two ways:
1) While obtaining Initial feasible Solution
2) While Revising the solution
When a solution is degenerate, the difficulty is that it cannot be tested for optimality. To resolve this, an infinitesimally small quantity ϵ is allocated to a suitable unoccupied cell so that the count of occupied cells reaches m + n - 1. The quantity ϵ obeys the following rules:
k + ϵ = k; k - ϵ = k; 0 + ϵ = ϵ;
ϵ + ϵ = ϵ; ϵ - ϵ = 0; k * ϵ = 0.
Example IV:
A company wants to ship loads of its product as shown below. The matrix shows the kilometres from the sources of supply to the destinations. The shipping cost is Rs. 10 per load per km. What shipping schedule should be used to minimize the total transportation cost?
Solution:
Since the total destination requirement of 25 units exceeds the total capacity of the sources, which is 22, the excess requirement is handled by adding a dummy plant Sexcess with a capacity of 3 units. We use zero transportation costs to the dummy plant.
Then modified total is shown below:
In order to remove degeneracy, we assign ϵ to the unoccupied cell (S2, D5), which has the minimum cost among the unoccupied cells, as shown in Table 2.
We use the MODI method, so first we have to find ui, vj and Δij with the following relations:
cij = ui + vj for occupied cells
Δij = cij – (ui + vj) for unoccupied cells.
Here some Δij are not greater than or equal to zero, so this is not an optimal solution. To improve it, we choose cell (Sexcess, D3), because it has the largest negative value and must therefore enter the basis. We then trace the closed path (Sexcess,D3)→(Sexcess,D4)→(S2,D4)→(S2,D5)→(S1,D5)→(S1,D3)→(Sexcess,D3), and min(ϵ, 3, 5) = ϵ. The new solution is shown in Table 4:
Here again some Δij are not greater than or equal to zero, so this is not an optimal solution. We choose cell (S3, D4), which has the largest negative value, to enter the basis, and trace the closed path (S3,D4)→(S3,D5)→(S1,D5)→(S1,D3)→(Sexcess,D3)→(Sexcess,D4)→(S3,D4). Here min(3, 5) = 3, and the resulting solution is shown in Table 6.
Again, we check optimality and calculate ui, vj and Δij as follows:
Again Δ(S3,D3) < 0, so this is not an optimal solution. We choose cell (S3, D3) to enter the basis, mark the closed path (S3,D3)→(S3,D5)→(S1,D5)→(S1,D3)→(S3,D3), and modify the table as shown below in Table 8.
Again, we check optimality; for this we calculate ui, vj and Δij as follows:
Example V
Goods have to be transported from sources S1, S2 and S3 to destinations D1, D2 and D3. The transportation cost per unit, the capacities of the sources and the requirements of the destinations are given in the following table.
Solution:
To find the initial basic feasible solution, we use the north-west corner method. The non-degenerate initial basic feasible solution is given in Table 1. Therefore, there is no degeneracy. To test optimality we use the MODI method, for which we first calculate ui, vj and Δij.
Since the unoccupied cell (S3, D1) has the largest negative opportunity cost, cell (S3, D1) is entered into the basis. We then choose the closed path (S3,D1)→(S3,D2)→(S2,D2)→(S2,D1)→(S3,D1). Here the maximum allocation that can be shifted through the negative cells is 300, so the modified solution is given below:
But in this solution degeneracy occurs, because the total number of positive allocations becomes 4, which is less than the required number m + n - 1 = 3 + 3 - 1 = 5. Hence this is a degenerate solution. To remove the degeneracy, a quantity ϵ is assigned to one of the cells that has become unoccupied, so that there are m + n - 1 occupied cells. Assign ϵ to either (S1, D1) or (S3, D2) and proceed with the usual solution procedure.
Proceeding with the usual solution procedure, the optimal solution is obtained with a total transportation cost of 1900 Rs.
In the most general form, a transportation problem has a number of origins and a number of
destinations. A certain amount of a particular shipment is available in each origin. Likewise,
each destination has a certain requirement/demand. The transportation problem indicates the
amount of shipment to be transported from various origins to different destinations so that the
total transportation cost is minimized without violating the availability constraints and the
requirement constraints. A number of techniques are available for computing an initial basic feasible solution of a transportation problem: the North West Corner rule, the Least Cost method and Vogel's Approximation Method (VAM). The optimum solution of a transportation problem can be calculated by the Modified Distribution (MODI) method. Sometimes the total available supply at the origins differs from the total demand at the destinations; such a transportation problem is said to be unbalanced. An unbalanced transportation problem can be made balanced by introducing an additional dummy row or column with zero transportation cost. A basic feasible solution of a transportation problem with m origins and n destinations should have m + n - 1 positive basic variables. If there are fewer than m + n - 1 basic variables, the solution is said to be degenerate. A degenerate transportation problem can be modified by adding an epsilon at an independent cell.
1. b
2. c
3. False
1. True
2. Penalty
3. m + n - 1
4. False
3.7 Glossary
A Degenerate Transportation Problem: a transportation problem with m origins and n destinations whose basic feasible solution has fewer than m + n - 1 positive basic variables.
3.8 Assignment
1. Find an initial basic feasible solution to the following transportation problem. Is it optimal? Use the VAM and MODI methods.
D1 D2 D3 D4 Available Units
O1 5 4 2 1 130
O2 2 3 7 5 100
O3 5 4 5 6 30
Demand 40 50 70 100
2. Mr. Contractor is a builder and owner of Ashiana Construction Company. Currently he has three large housing projects in hand, located at Andheri, Bandra and Chinchwad. He procures cement from four plants located at Dumdum, Ellora, Feroza and Guna. The basic feasible solution as determined by the North West Corner rule is given below:
Projects A B C Availability
Plants
1 2[50] 7 4 50
2 3[20] 3[60] 1 80
3 5 4[30] 7[40] 70
4 1 6 2[140] 140
Demand 70 90 180 340
Mr. Contractor wants to plan the movement of cement in such a manner that the optimal minimum transportation cost is reached. Assist him.
3. A company has three plants and four warehouses. The supply and demand in units and the corresponding transportation costs are given. The table below shows an initial solution of the problem.
Warehouses
Plants I II III IV Supply
1 5 10 4[10] 5 10
2 6[20] 8 7 2[5] 25
3 4[5] 2[10] 5[5] 7 20
Demand 25 10 15 5 55
Answer the following questions, giving brief reasons:
(a) Is this solution degenerate?
(b) Is this solution optimal?
(c) Does this problem have more than one optimal solution? If so,
show all of them.
4. A company has three plants and three warehouses. The supply and demand in units and the corresponding transportation costs are given. The table below shows an initial solution of the problem. Find an optimal solution.
5. A product is produced by four factories A, B, C and D. The per-unit production costs are Rs. 2, Rs. 3, Rs. 1 and Rs. 5 respectively. The production capacities of A, B, C and D are 50, 70, 30 and 50 units respectively. These factories supply the product to four stores I, II, III and IV, with demands of 25, 35, 105 and 20 units respectively. The per-unit transportation costs in rupees are given in the table below. Determine the extent of deliveries from each of the factories to each of the stores so that the total cost (production and transportation cost) is minimum.
Stores
I II III IV
Factory A 2 4 6 11
Factory B 10 8 7 5
Factory C 13 3 9 12
Factory D 4 6 8 3
3.9 Activities
XYZ Shipping Corp. is a leading shipping corporation of the nation, with offices in Mumbai and Gandhidham. They provide services to different companies and transport their goods from warehouses to marketplaces. The following table provides all necessary information on the supply available at each warehouse, the requirements of the various markets, and the unit transportation cost (in thousand Rs.) from each warehouse to each market. Mr. Sanjay, the shipping clerk of the agency, usually prepares the transportation schedule based on his expertise and vast experience. He has worked out the following schedule on the basis of his assumptions: 12 units from A to Q, 1 unit from A to R, 9 units from A to S, 15 units from B to R, 7 units from C to P, 1 unit from C to R.
Markets
Warehouse P Q R S Supply
A 6 3 5 4 22
B 5 9 2 7 15
C 5 7 8 6 8
Demand 7 12 17 9 45
a) As a consultant to the company, check and analyze whether Mr. Sanjay has arranged an optimal schedule or not. You can apply the transportation method.
b) Find the optimal schedule and the minimum total transportation cost. Does this problem have only one optimal solution? Justify your answer.
4.1 Introduction
4.1.1 Basic Structure of Assignment
4.6 Glossary
4.7 Assignment
4.8 Activities
4.1 Introduction
The assignment problem in its general form can be stated as follows: given m facilities, n jobs, and the effectiveness of each facility for each job, assign each facility to one and only one job in such a way that the measure of effectiveness is optimized (maximized or minimized). Several problems of management may be applications of the assignment problem. A project manager may have five people available for assignment and five projects to fill; he is interested in knowing which job should be assigned to which person so that all project tasks may be accomplished in the shortest possible time. Likewise, an institute may have different subjects to be taught by different faculty members; the task is to assign subjects in such a way that the faculty can complete them efficiently within a short period of time. In a marketing setup, by estimating the sales performance of different salesmen in different territories, one could assign a particular salesman to a particular territory with a view to maximizing overall sales. It may be noted that with n facilities and n jobs there are n! possible assignments. One way of finding an optimum assignment is to write out all the n! possible arrangements, evaluate their total cost (in terms of the given measure of effectiveness), and select the assignment with minimum cost. This method leads to a lengthy computational process; hence it is necessary to develop a more suitable computational procedure to solve an assignment problem.
Consider the following example to understand the basic structure of an assignment. The cost table for an assignment problem is given below. The important condition here is that the number of rows and columns must be the same, and the assignment must always be one to one: an operator cannot be assigned to more than one machine, nor a machine to more than one operator.
Operator Machine
A B C D
1 10 2 8 6
2 9 5 11 9
3 12 7 14 14
4 3 1 4 2
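For a table this small, the n! enumeration mentioned above is actually feasible (4! = 24 assignments) and can serve as a check on the Hungarian method introduced next:

```python
from itertools import permutations

# Operator x machine cost table from the example above.
cost = [[10, 2, 8, 6],
        [9, 5, 11, 9],
        [12, 7, 14, 14],
        [3, 1, 4, 2]]

n = len(cost)
# p[i] is the machine assigned to operator i; try all n! one-to-one assignments
best = min(permutations(range(n)),
           key=lambda p: sum(cost[i][p[i]] for i in range(n)))
best_cost = sum(cost[i][best[i]] for i in range(n))
print(best, best_cost)
```

This brute force grows factorially with n, which is exactly why a systematic procedure such as the Hungarian method is needed for larger problems.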
4.2Assignment – Hungarian Assignment Method
4.2.1 Algorithm of Hungarian Assignment Method
Step 1: Set out the cost table for the given problem. If the number of origins is not equal to the number of destinations, a dummy origin or destination must be added with zero costs.
Step 2: Find the smallest cost in each row of the cost table. Subtract this smallestcost element
from each element in that row. Therefore, there will be at-least one zero in each row of this new
table, called the first Reduced Cost Table.
Step 3: Find the smallest element ineach column of the reduced cost table. Subtract this smallest
cost element from each element in that column.As a result, each row and column now has at-
least one zero value in the second reduced cost table.
Step 4: Draw the minimum number of horizontal and vertical lines that cover all the zeros.
Step 5: Compare the number of drawn lines with the number of rows (and columns). If they are equal, go to step 6; otherwise, go to step 7.
Step 6: Starting with the first row, make an assignment where there is a single zero and cross out the other zeros in the corresponding column. Repeat the procedure until an assignment is made for every job. An optimal assignment is found if the number of assigned cells equals the number of rows (and columns).
Step 7: Examine the elements that are not covered by a line. Choose the smallest of these, subtract it from every element that does not have a line through it, and add it to every element that lies at the intersection of two lines. The resulting matrix is a new revised cost table. Repeat steps 4 and 5 until the number of lines equals the number of rows and columns.
Step 8: Calculate total cost or profit with reference to the original matrix.
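The row and column reductions (steps 2 and 3) can be sketched directly on the operator/machine table shown earlier:

```python
# Operator x machine cost table from the example above.
cost = [[10, 2, 8, 6],
        [9, 5, 11, 9],
        [12, 7, 14, 14],
        [3, 1, 4, 2]]

# Step 2: subtract each row's minimum from every element of that row.
reduced = [[c - min(row) for c in row] for row in cost]

# Step 3: subtract each column's minimum from every element of that column.
col_min = [min(col) for col in zip(*reduced)]
reduced = [[c - m for c, m in zip(row, col_min)] for row in reduced]

print(reduced)   # every row and column now contains at least one zero
```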
Example I
Let us assume that Geeta is a sorority pledge coordinator with four jobs and only three pledges. Geeta decides that the assignment approach is appropriate, except that she will minimize total time instead of money (since the pledges aren't paid). Geeta also realizes that she will have to create a dummy fourth pledge, and she knows that whatever job gets assigned to that pledge will not be done (this semester, anyhow). She creates estimates for the respective times and places them in the following table. E is, of course, the dummy pledge, so her times are all zero.
Solution of Example I
(a) The first step in this algorithm is to develop the opportunity cost table. This is done by
subtracting the smallest number in each row from every value in that row, then, using these
newly created figures, by subtracting the smallest number in each column from every other value
in that column. Whenever these smallest values are zero, the subtraction results in no change.
Job 1 Job 2 Job 3 Job 4
B 1 6 0 5
C 5 6 0 4
D 0 1 2 4
E 0 0 0 0
No change was produced when dealing with the columns since the smallest values were always
the zeros from row four.
(b) The next step is to draw lines through all of the zeros. The lines must be straight, either horizontal or vertical, and you are to use as few lines as possible. If four lines are required (four because it is a 4 × 4 matrix), an optimal assignment is already possible. If fewer than four lines are required, another step is needed before optimal assignments can be made. In our example, draw a line through row four, column three, and either column one or row three.
(c) Since the number of lines required was less than the number of assignees, a third step is
required (as is normally the case). Looking at the version of the matrix with the lines through it,
determine the smallest number not covered by a line. Subtract this smallest number from every
number not covered by a line and add it to every number at the intersection of two lines.
Job 1 Job 2 Job 3 Job 4
B 0 5 0 4
C 4 5 0 3
D 0 1 3 4
E 0 0 1 0
Draw the minimum number of lines to cover all the zeroes, and we have the matrix below.
Job 1 Job 2 Job 3 Job 4
B 0 5 0 4
C 4 5 0 3
D 0 1 3 4
E 0 0 1 0
Since only 3 lines are needed to cover the zeroes, we determine the smallest number not covered
by a line. Subtract this smallest number from every number not covered by a line and add it to
every number at the intersection of two lines. The result is shown with the new lines drawn
through the zeroes.
Job 1 Job 2 Job 3 Job 4
B 0 4 0 3
C 4 4 0 2
D 0 0 3 3
E 1 0 2 0
(d) Since this matrix requires four lines to cover all zeros, we have now reached an optimal
solution stage.
(e) In our example the assignments must be: C to job 3 = 2, B to job 1 = 4, D to job 2 = 4 and E to job 4 = 0. Since E is a dummy row, the job labelled job 4 does not get completed. The total time is therefore 2 + 4 + 4 + 0 = 10.
Check your progress 1
1. An optimal solution of an assignment problem can be obtained only if
(a) each row and column has only one zero element
(b) each row and column has at least one zero element
(c) the data are arrangement in a square matrix
(d) none of the above
2. In an assignment problem,
(a) one agent can do parts of several tasks
(b) one task can be done by several agents
(c) each agent is assigned to its own one best task
(d) none of the above
3. Even though the number of drawn lines is not equal to the number of rows and columns, an optimal solution can be found. State true or false.
4. The procedure used to solve assignment problems wherein one reduces the original
assignment costs to a table of opportunity costs is called __________.
4.3.1 Unbalanced Assignment Problem: When the numbers of rows and columns are not the same, we have an unbalanced assignment problem. A dummy row or column, whichever is fewer, needs to be added. In Example I, we added a dummy row with zero costs to solve the unbalanced assignment problem.
4.3.2 Prohibited Assignment: When a particular assignment is not permitted, its cost is taken as very large so that it is automatically prevented. For example, in a production unit four new machines M1, M2, M3 and M4 are to be installed in a machine shop. There are five vacant places A, B, C, D and E available. Because of limited space, machine M2 cannot be placed at C and M3 cannot be placed at A. The cost of locating a machine at a place, in thousands of rupees, is as under:
A B C D E
M1 4 6 10 5 6
M2 7 4 - 5 4
M3 - 6 9 6 2
M4 9 3 7 2 3
4.3.3 Multiple Optimal Solutions: When, during the final assignment, there is no single zero in some row or column, it can be a case of multiple optimal solutions. It is possible that one row and one column both have two zeros; arbitrarily start with either zero of the row or column and find the assignment, then start again with the second zero in the same way. The assignments in rows or columns with a single zero remain the same in both solutions. The following example will make this clear.
Example II
Consider the following assignment problem. The Spicy Spoon restaurant has four payment counters, and four persons are available for service. The cost of assigning each person to each counter is given in the following table. Assign one person to one counter so as to minimize the total cost.
Person 1 2 3 4
A 1 8 15 22
B 13 18 23 28
C 13 18 23 28
D 19 23 27 31
Solution of Example II
After applying steps 1 to 3 of the Hungarian Method, we obtain the following matrix.
Person 1 2 3 4
A 0 3 6 9
B 0 1 2 3
C 0 1 2 3
D 0 0 0 0
Covering the zeros here needs only two lines (column 1 and row D), so we revise the matrix using the smallest uncovered element, which is 1:
Person 1 2 3 4
A 0 2 5 8
B 0 0 1 2
C 0 0 1 2
D 1 0 0 0
The resulting matrix requires four lines to cover all the zeros, and it suggests alternative optimal solutions: Option 1 assigns A to 1, B to 2, C to 3 and D to 4, while Option 2 assigns A to 1, B to 3, C to 2 and D to 4. Both give the same total cost of 1 + 18 + 23 + 31 = 73, as shown in the following:
Option 1
Person 1 2 3 4
A 0 2 4 7
B 0 0 0 1
C 0 0 0 1
D 2 1 0 0
Option 2
Person 1 2 3 4
A 0 2 4 7
B 0 0 0 1
C 0 0 0 1
D 2 1 0 0
4.3.4 Maximization Types of Problems: First select the maximum value in the whole matrix and subtract every value from it; the resulting matrix is known as the revised cost matrix. Then apply the Hungarian assignment method to this table, as explained in Example I. The following example will make this clear.
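The conversion itself is one line of code. The 3 × 3 profit table below is purely hypothetical, invented here for illustration:

```python
# Hypothetical profit (maximization) matrix -- not from the text.
profit = [[42, 35, 28],
          [30, 25, 20],
          [30, 25, 22]]

biggest = max(max(row) for row in profit)
# Revised cost matrix: subtract every entry from the largest entry; the
# Hungarian method can then be applied to it as a minimization problem.
revised = [[biggest - p for p in row] for row in profit]
print(revised)
```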
A company has four sales representatives who are to be assigned to four different sales territories. The monthly sales increase estimated for each sales representative in each territory (in lakh rupees) is shown in the following table. Suggest the optimal assignment and the total maximum sales increase per month.
2. Solve the following assignment problem so as to minimize the time (in days) required to complete all the tasks.
person task
T1 T2 T3 T4 T5
A 6 5 8 11 16
B 1 13 16 1 10
C 16 11 8 8 8
D 9 14 12 10 16
The Assignment Problem considers the allocation of a number of jobs to a number of persons so that the total completion time or cost is minimized, or the total profit is maximized. If the number of persons is the same as the number of jobs, the assignment problem is said to be balanced; if they differ, it is said to be unbalanced. An unbalanced assignment problem can be converted into a balanced one by introducing a dummy person or a dummy job with completion time zero. Though an assignment problem can be formulated and solved as a linear programming problem, it is usually solved by a special method known as the Hungarian Method. If the times of completion or the costs corresponding to every assignment are written down in matrix form, the result is referred to as a cost matrix. The original cost matrix can be reduced to another cost matrix by following the steps of the algorithm. Different cases of the assignment problem are possible. If a person is unable to carry out a particular job, the corresponding cost or completion time is taken as very large, which automatically prevents such an assignment. Multiple optimal solutions with the same cost or profit are also possible. If the objective is to maximize performance or profit through assignment, the Hungarian Method can be applied to a revised cost matrix obtained from the original cost matrix.
1. b
2. c
3. False
1. False
2. A – T2 = 5, B – T4 = 1, C – T3 = 8, D – T1 = 9, E – T5 = 0
Total Time = 5 + 1 + 8 + 9 + 0 = 23 Days
4.6 Glossary
Assignment Problem: a special type of linear programming problem where the objective is to minimize the cost or time of completing a number of jobs by a number of persons.
A Dummy Job: an imaginary job with cost or time zero, introduced to make an unbalanced assignment problem balanced.
4.7 Assignment
Machine M1 M2 M3 M4 M5
Man
A 3 2 7 4 8
B 5 4 3 8 5
C 3 7 9 1 2
D 4 2 6 5 7
E 2 8 4 6 6
2. Fix-It Shop has received three new rush projects to repair: a radio, a toaster oven, and a broken coffee table. Three repair persons, each with different talents and abilities, are available to do the jobs. The Fix-It Shop owner estimates the wage cost of assigning each of the workers to each of the three projects. The costs, shown in the table, differ because the owner believes that each worker will differ in speed and skill on these quite varied jobs. The owner's objective is to assign the three projects to the workers in a way that will result in the lowest total cost to the shop. What is the optimal assignment?
Project
Person 1 2 3
Adams 11 14 6
Brown 8 10 11
Cooper 9 12 7
S1 S2 S3 S4 S5
B1 4 6 7 5 11
B2 7 3 6 9 5
B3 8 5 4 6 9
B4 9 12 7 11 10
B5 7 5 9 8 11
Find the optimum assignment of products on the setup resulting in the minimum cost.
4.8 Activity
An airline that operates 7 days a week has the timetable given below. Crews must have a minimum layover of 5 hours between flights. Obtain the pairing of flights that minimizes the layover time away from home, assuming that a crew can be based at either of the two cities. Suggest an optimum assignment of crews that results in the smallest layover.
Delhi – Jaipur                Jaipur – Delhi
Flight No. Depart Arrive      Flight No. Depart Arrive
1 7.00 am 8.00 am             101 8.00 am 9.15 am
2 8.00 am 9.00 am             102 8.30 am 9.45 am
3 1.30 pm 2.30 pm             103 12 Noon 1.15 pm
4 6.30 pm 7.30 pm             104 5.30 pm 6.45 pm
In this block, we discussed widely used techniques of operations research in detail for optimal
output in terms of maximum profit or minimum cost with available resources. In the first unit, a
special technique of operations research called linear programming was explained. Based on the
number of products and resources available, formulation and solution for only two decision
variables were discussed through graphical method. Special problems like infeasibility and
unboundedness of LPP were discussed. In the second unit, the technique of simplex method was
discussed for two or more decision variables. In the third unit, the concept of transportation was
discussed with initial basic feasible solution methods and optimal solution method. Special cases
of transportation like unbalanced, multiple optimal solutions and degeneracy were explained. In
the last unit, the method of assignment was explained to find the effectiveness of assigning jobs
to each facility along with special cases like unbalanced assignment, prohibited, multiple optimal
solutions and maximization types of assignment problems.
Block Assignment
Schedule
Attendant A B C D E F
1 7 4 6 10 5 8
2 4 5 5 12 7 6
3 9 9 11 7 10 8
4 11 6 8 5 9 10
5 5 8 6 10 7 6
6 10 12 11 9 9 10
TO
From 1 2 3 4 Supply
1 500 750 300 450 12
2 650 800 400 600 17
3 400 700 500 550 11
Demand 10 10 10 10
(a) Find the initial solution using the northwest corner method, the minimum cell cost
method, and Vogel’s approximation method. Compute the total cost for each.
(b) Using the VAM initial solution, find the optimal solution using the modified distribution
method (MODI).
Block Introduction
In this block, some more operations research techniques will be discussed. In the first unit,
situations related to planning, scheduling and controlling of projects will be discussed. The
process of developing network diagrams and finding project completion time will be covered. In
the second unit the nature and scope of the waiting line concept will be discussed. Some basic
waiting line models and their application will also be covered. In the last unit the concept and
scope of game theory will be discussed. The consequences of the interplay of combinations of
strategies with competitors and the methods employed to derive the optimal strategy will be covered.
Block Objective
• Understand situations related to planning, scheduling and controlling of projects
• Develop simple network diagrams with activities.
• Identify the critical path and compute the project completion time
• Compute Slack and float
• Estimate the probability of project completion on a desired date
• Understand the nature and scope of waiting line system
• Describe the characteristics and structure of waiting line system
• Understand the application of statistics in solving waiting line problems
• Apply common waiting line models in suitable business problems
• Determine the optimum parameters of queuing models
• Understand the concept and scope of game theory
• Understand the consequences of interplay of combination of strategies with competitor
• Distinguish between different type of game situations
• Analyse and derive the optimal strategy in a game
• Understand the rule of dominance for solving game problems.
Block Structure
Unit 1 Project Scheduling-PERT/CPM
Unit 2 Waiting Line Models
Unit 3 Game Theory
Unit No. 1 Project Scheduling –CPM/PERT
_________________________________
Unit Structure
1.0 Learning Objectives
1.1 Introduction
1.2 PERT/CPM network
1.2.1 Key Concepts
1.2.2 Rules for Network Construction
Check your progress 1
1.7 Glossary
1.8 Assignment
1.9 Activities
1.10 Case Study
1.11 Further Reading
1.0 Learning Objectives
1.1 Introduction
Both PERT and CPM techniques use similar terminology and are used for similar purposes;
however, they were developed independently of each other in the late 1950s. PERT was developed
and used for planning and designing the Polaris submarine system. CPM, on the other hand, was
developed by the Du Pont Company and Univac of Remington Rand Corporation as a device to
control the maintenance of chemical plants. The basic difference between the two techniques is-
PERT is useful for project scheduling problems where the completion time of different activities
is not certain and CPM is used in situations where the activity durations are known with
certainty. In CPM, not only the amount of time needed to perform the various tasks but also the
resources required to perform each of the activities are assumed to be known. This technique is
basically concerned with obtaining the trade-off between the project duration and cost. So
variation in project time is inherent with PERT while in CPM it can be systematically varied by
using additional resources. Basically it can be said that PERT is probabilistic in nature and CPM
is deterministic.
Today’s computerized versions of PERT and CPM techniques combine the best features of both
approaches. Thus the distinction between the two techniques is no longer necessary. So in this
unit we will refer project scheduling techniques as PERT/CPM.
1.2.1 Key Concepts
Activity: An operation or task which utilizes resources and consumes time is known as an
activity. An activity is represented by a single arrow, also called an arc, in the project network.
The head of the arrow shows the sequence or flow in which activities are to be done. The activity
arrow is not scaled and the length of the activity arrow is a matter of convenience and clarity and
is not related to the time required by the activity. All activities should be defined properly, so
that its beginning and end can be identified clearly. A project consists of several activities. For
example, construction of the house involves many activities like- getting finance, building
foundation, Order and receiving materials, building house, selecting paint, selecting furnishings,
painting, finishing work etc.
Event: An event marks the beginning or completion of an activity. Events are points in time and
can be considered as milestones. An event in a network is represented by a circle. Events are
also called nodes. The difference between an activity and an event is that an activity is a
recognizable part of the project, involving physical and mental work and requiring time and
resources for its completion, whereas an event is an accomplishment at a point of time which
neither requires time nor consumes resources.
Predecessor Activity: An activity which should be completed immediately prior to the start of
another activity is called Predecessor activity
Successor Activity: An activity which cannot be started, until the completion of one or more
activities is called successor activity
Concurrent Activity: Activities that should be done simultaneously are called concurrent
activities. It should be noted that an activity can be a predecessor or successor to another activity
and may be concurrent with one or more activities.
Dummy Activity: A dummy activity is an activity which doesn’t consume any time or
resource. It is an imaginary activity that does not exist among the project activities. A dummy activity is
needed when:
1. Two or more activities in a project have identical immediate predecessor and successor
activities.
2. Two or more activities have some (and not all) of their predecessor activities in common.
Dummy activities are usually shown by arrows with dashed lines. To illustrate, in Fig 1, we have
a situation in which both the activities A and B have the same start and end events. It is incorrect
to represent the activities A and B as shown in Part (i), because 1-2 would then represent either A
or B, which is against the rule of assigning unique numbers to activities for the purpose of
identification.
[Fig 1: (i) activities A and B both drawn between events 1 and 2 (incorrect); (ii) the same activities drawn with a dummy activity, so that A is 1-2 and B is 1-3 (correct)]
By introducing a dummy activity, the activities A and B can be identified as 1-2 and 1-3
respectively, as shown in Part (ii). Thus, in situations where two or more activities have the same
beginning and end events, a dummy activity is introduced to resolve the problem.
1.2.2 Rules for Network Construction
There are a number of concepts and rules which should be followed in dealing with activities and
events when making a network. They help to develop a correct structure of the network.
1. Each activity is represented by one and only one arrow in the network. Therefore no
single activity can be represented twice in the network.
2. Events are identified by numbers. The number given to an event should be higher than
that allotted to the event immediately preceding it.
3. The activities are identified by the numbers of their starting and ending nodes.
4. Parallel activities between two events are prohibited. Thus, no two activities can have
the same start and end events.
5. Before an activity can be undertaken, all activities preceding it must be completed.
6. Dangling must be avoided in a network. A dangling event is one that is not connected to
another event by an activity: an activity merges into the event, but no activity
starts or emerges from it, so the event becomes detached from the network.
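Some of these rules lend themselves to a quick mechanical check. The sketch below, using a hypothetical list of arcs, tests three of them: increasing event numbers, no parallel activities between the same pair of events, and no dangling events. It is illustrative only, not a full network validator.

```python
def check_network(arcs):
    """Check a few construction rules for an activity-on-arrow network
    given as a list of (start_event, end_event) pairs.
    Returns a list of rule violations (empty if the network passes)."""
    problems = []
    # Rule 2: event numbers must increase along each arrow
    for i, j in arcs:
        if j <= i:
            problems.append(f"arc {i}-{j}: end event not numbered higher")
    # Rule 4: no two activities may share the same start and end events
    if len(set(arcs)) < len(arcs):
        problems.append("parallel activities share identical start/end events")
    # Rule 6: no dangling events (an event, other than the finish event,
    # that receives an activity but has none emerging from it)
    starts = {i for i, _ in arcs}
    ends = {j for _, j in arcs}
    for event in starts | ends:
        if event in ends and event not in starts and event != max(ends):
            problems.append(f"event {event} dangles: nothing emerges from it")
    return problems

# Hypothetical network: event 4 receives an arc but leads nowhere,
# and is not the finish event (5), so it dangles.
arcs = [(1, 2), (2, 3), (2, 4), (3, 5)]
violations = check_network(arcs)
```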
Check your Progress 1
1. PERT stands for program enterprise and resource technique. (True/False)
2. A dummy variable is an activity inserted into the AOA network diagram to show a
precedence relationship, but does not represent any passage of time. (True/False)
3. Unlike PERT, CPM incorporates probabilistic time estimates into the project
management process. (True/False)
4. An activity which should be completed immediately prior to the start of another activity
is
a. Successor activity
b. Predecessor activity
c. Dummy activity
d. Concurrent activity
5. An activity which cannot be started until the completion of one or more activities is
called a
a. Successor activity
b. Predecessor activity
c. Dummy activity
d. Concurrent activity
The first step in the PERT/CPM scheduling process is to develop a list of all the activities that
comprise a project and their interdependence relationships. Let us take the example of construction
of a commercial complex. First we need to prepare the plan of the complex. Next we may prepare
a prospectus and start looking for potential tenants. A contractor should be selected, building
permits should be prepared and approval should be obtained. Then the construction can be done.
Lastly the contracts can be finalized with tenants and they can move in. In this project, the
various activities required to be performed along with the time needed for execution are given in
Table1.
Table 1: Construction of commercial complex
Activity Description Duration Immediate Predecessor
A Prepare Plan of the commercial complex 5 -
B Develop prospectus for tenants 4 A
C Identify the potential tenants 6 B
D Select contractor 3 A
E Prepare building permits 1 A
F Obtain approval for building permits 4 E
G Perform Construction 14 D,F
H Finalize contracts with tenants 12 B,C
I Tenants move in 4 G,H
Note that this table contains information about immediate predecessors. The immediate
predecessors for a particular activity are those that must be completed immediately before this
activity may start. For example, activity A, preparing the plan of the commercial complex, can be
started at any time, as it is the first activity. Activities B, D and E can be started only after
completing activity A. In the same way the rest of the information in the table can be understood.
Once the activities comprising a project and the interdependency relationship among them is
clearly identified, they can be portrayed graphically using a network or an arrow diagram. As
earlier explained, the arrows in a project network represent various activities in a project. Along
with each arrow the description and duration of the activity is represented. The circles at the
beginning and at the end of the arrow represent the nodes or the events.
Activity A has no predecessor activity, as it is the first activity. Let us assume that activity ‘A’
starts at node 1 and ends at node 2. It is represented graphically as below:
[Diagram: activity A drawn as an arrow from node 1 to node 2]
Next, activities B, D and E have a precedence of A, so all these activities will start at the end node
of A. Let us demonstrate:
[Diagram: activities B, D and E starting at node 2, the end node of A, and ending at nodes 3, 5 and 4 respectively]
As activity C has a precedence of B, it will start at node 3. Similarly, activity F will start at node
4. However, as activity G has a precedence of two activities, D and F, both of these will end at node 5.
[Diagram: partial network with nodes 1 to 6, showing A (1-2), B (2-3), C (3-6), D (2-5), E (2-4) and F (4-5)]
Similarly rest of the precedence relationship can be followed and the final network can be
developed. This figure depicts the project network for constructing the commercial complex.
[Figure: complete project network for the commercial complex, nodes 1 to 8, with activities A, B, C, D, E, F, G, H and I]
We have earlier discussed the concept of a dummy activity. A dummy activity is an imaginary
activity which does not require any resource or consume time. It is required when: (a) two or
more activities in a project have identical immediate predecessor and successor activities, or (b)
two or more activities have some (and not all) of their predecessor activities in common. Let us
take an example to understand the use of dummy activity in constructing a network.
Illustration: The table 2 gives the activities involved in construction of a house. Develop a
project network
Table 2 – Construction of a house
Activity Description Duration Immediate Predecessor
A Design House 3 -
B Lay foundation 2 A
C Order and receive materials 1 A
D Build house 6 B,C
E Select paint 1 B,C
F Select furnishings 1 E
G Finish Work 3 D,E
The first activity is A, with no precedence and activity B and C have precedence of A. This can
be represented as:
[Diagram: activity A from node 1 to node 2, with B (2-3) and C (2-4)]
Both activities B and C have activity A as predecessor and activities D and E as successors. A
dummy is required when two or more activities have identical immediate predecessor and
successor activities. Hence a dummy is required in this step, starting at the end of either
activity B or C.
[Diagrams: the network with a dummy activity joining the end events of B and C, followed by the complete house-construction network with activities A to G]
1.3.2 The Concept of Critical Path
To determine the project completion time, we have to analyse the network and identify what is
called the critical path of the network. Let us first understand the concept of a path. A path is a
sequence of connected nodes that leads from the start node to the finish node. The longest path of the
network is called the critical path. Identifying the critical path of a network is very important,
as it determines the project completion time. If any activity on the critical path is delayed, the whole
project will be delayed. There can be multiple critical paths if there is a tie among the longest
paths. To understand the concept of the critical path and project completion, let us consider the
earlier example given in Table 1.
[Figure: project network for the commercial complex with activity durations shown in brackets, e.g. B(4), C(6), E(1), F(4), H(12)]
In the above network, the time estimates are mentioned within bracket along with the activity
name on the arrow. There are three possible paths for this network. For this simple network, the
critical path can be found by enumerating all the possible paths. These paths are listed below:
Path                Length
(i) A→B→C→H→I       31
(ii) A→D→G→I        26
(iii) A→E→F→G→I     28
The first path (A→B→C→H→I) is the critical path, as it takes the longest period of time to
complete, i.e. 31 months. For this network the project completion time will be 31 months. The
activities on the critical path are known as critical activities, as a delay in any one of them can
delay the entire project. In other words, there is no slack time in the activities on the critical path.
Slack time is the time an activity can be delayed without delaying the project.
For a small network it is simple to list all the possible paths and compare them to find the critical path.
As the number of activities increases, the network becomes complex and finding the critical path
by enumerating all paths becomes time consuming. Therefore there is a need for a
systematic approach to find the critical path. These computations involve a forward and a
backward pass through the network. The forward pass calculation begins at the start event and
moves to the end event of the project network, i.e. from left to right of the network. The
backward pass calculation begins at the end event and moves to the start event of the network, i.e.
from right to left of the network.
1.3.3 Determination of Earliest start and Earliest Finish Times- Forward pass
The earliest start (ES) time indicates the earliest that a given activity can be scheduled and
earliest finish (EF) times indicates the time which the activity can be completed, at the earliest.
To begin with, each of the activities initiated at the starting node is assumed to start at time ‘0’.
The earliest finish time for each activity is obtained by adding the activity time to the ES time.
The formula of EF is:
EF = ES + t, where t is the activity time
In our example, activity A is the first activity and therefore will start at ‘0’ time. As the duration
of the activity A is 5 months, so its EF time will be 0+5=5. Now all the subsequent activities are
assumed to start as soon as possible, that is as soon as all of their respective predecessor
activities are completed. For a given activity, the ES would be taken as the maximum of the EF’s
of the activities preceding the activity. For activity B,D and E there is only one predecessor
activity i.e activity A and EF of A is 5, so [ES, EF] of B is [5,9];[ES, EF] of D is [5,8] and[ES,
EF] of E is [5,6] . Similarly for C and F the [ES, EF] are [9,15] and [6,10] respectively. The ES
time of G has to be the maximum of EF’s of the two preceding activities D (EF=8) and
F(EF=10). Therefore the ES of G is 10 and EF is 24 (10+14). The remaining values are
calculated and given in Table 3.
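The forward pass just described can be written as a short loop. Using the durations and predecessor lists of Table 1, with the activities listed in an order where every predecessor comes first, ES is the maximum EF of the predecessors and EF = ES + t:

```python
# Durations (months) and immediate predecessors from Table 1
duration = {'A': 5, 'B': 4, 'C': 6, 'D': 3, 'E': 1,
            'F': 4, 'G': 14, 'H': 12, 'I': 4}
preds = {'A': [], 'B': ['A'], 'C': ['B'], 'D': ['A'], 'E': ['A'],
         'F': ['E'], 'G': ['D', 'F'], 'H': ['B', 'C'], 'I': ['G', 'H']}

ES, EF = {}, {}
for act in duration:                    # dict order already respects precedence
    # ES = max EF of predecessors (0 for the starting activity)
    ES[act] = max((EF[p] for p in preds[act]), default=0)
    EF[act] = ES[act] + duration[act]   # EF = ES + t
```

Running this reproduces the values worked out above, e.g. [ES, EF] of G as [10, 24] and a final EF of 31 for activity I.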
1.3.4 Determination of Late Start and late Finish times- Backward Pass
The concept of the backward pass is to compute the latest allowable times of starting and
finishing, LS and LF for each of the activities of the project. The term ‘ latest allowable “ means
how much an activity can be delayed without delaying the project completion time. The
computations for the backward pass start at the terminal event and move towards the start event.
The terminal node is assigned the latest of EF times of activities merging into it. In our example,
there is only one terminal activity, so the time assigned to node 8 will be 31. This implies that the
latest finish (LF) time of activity I is equal to 31. The formula for Latest start time is:
LS = LF − t, where t is the activity time
The LS time for an activity is equal to its LF time minus its duration; so for activity I the LS would
be 31 − 4 = 27. For the other activities, the LF time is set equal to the smallest, or minimum,
of the LS times of its successor activities. The LF times of activities G and H would
be equal to 27, the LS of their only succeeding activity I. The latest start and finish times of
activities F, E, D, C and B are calculated in the same way.
However, activity A has three succeeding activities: B, D and E. In this case, the minimum of the LS
times of these three activities is taken as the LF of activity A. In our example the LS times of
activities B, D and E are 5, 10 and 8 respectively, so the LF of activity A is 5 and its LS is 0.
All the calculated latest finish times are given in Table 3.
Once the forward pass and backward pass times are computed, it becomes very easy to calculate
the critical path. If the early start and late start or early finish and late finish values are equal,
then the activity is referred as a critical activity. If the values are not equal, the activity is termed
as non critical. The path consisting of critical activities is called a critical path.
1.3.5 Determination of float
The concept of float is of paramount importance to a project manager. Every critical activity in a
network cannot be scheduled later than their earliest schedule time without delaying the project
duration. However, non-critical activity can be scheduled later and allows exercising control over
time, resources, or cost. This flexibility is seen in terms of the float or slack that any activity has.
It is the time available to an activity in addition to its duration. Since each activity has four
associated times, four types of floats can be identified. In practice, only three are used and
discussed here:
Total Float: The total float of an activity represents the amount of time by which it can be
delayed without delaying the project completion date. It is equal to the difference between the
total time available for the performance of an activity and the time required or its performance.
For any activity, the total float is calculated as follows:
Total Float = LF − EF = LS − ES = LF − ES − t, where t is the activity time
In our example, for activity D, Total Float = LF − EF = 13 − 8 = 5, and equally LS − ES = 10 − 5 = 5.
Free Float: The free float is that part of the total float which can be used without affecting the
float of the succeeding activities. The free float is calculated as the earliest start time of the
following activity (j) minus the earliest finish time of this activity (i).
Independent Float: The independent float time of an activity is the amount of float time which
can be used without affecting either the head or tail events. The value of independent float is as
follows, if ‘i’ is the preceding activity, ‘j’ is the succeeding activity and ‘t’ is the duration of
activity
Independent Float = ESj − LFi − t
In our example, for activity D, Independent Float = ESj − LFi − t = 10 − 5 − 3 = 2.
The independent float is always either equal to or less than the free float of an activity. A
negative value of independent float may be obtained, but in that case independent float is taken
as zero. Based on the data given in Table 2, the Earliest and latest times and floats can be
calculated as below:
Table 3: Calculation of Earliest and latest times and float
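Table 3's entries can be reproduced with a short script that runs the forward pass, the backward pass and the total-float calculation for the Table 1 data. This is a sketch of the procedure described above, assuming the activity list is already in precedence order.

```python
# Durations (months) and immediate predecessors from Table 1
duration = {'A': 5, 'B': 4, 'C': 6, 'D': 3, 'E': 1,
            'F': 4, 'G': 14, 'H': 12, 'I': 4}
preds = {'A': [], 'B': ['A'], 'C': ['B'], 'D': ['A'], 'E': ['A'],
         'F': ['E'], 'G': ['D', 'F'], 'H': ['B', 'C'], 'I': ['G', 'H']}
order = list(duration)                       # already in precedence order

# Forward pass: ES = max EF of predecessors, EF = ES + t
ES, EF = {}, {}
for a in order:
    ES[a] = max((EF[p] for p in preds[a]), default=0)
    EF[a] = ES[a] + duration[a]

# Successor lists, derived from the predecessor lists
succ = {a: [b for b in order if a in preds[b]] for a in order}

# Backward pass: LF = min LS of successors, LS = LF - t
project_time = max(EF.values())
LS, LF = {}, {}
for a in reversed(order):
    LF[a] = min((LS[s] for s in succ[a]), default=project_time)
    LS[a] = LF[a] - duration[a]

total_float = {a: LF[a] - EF[a] for a in order}   # equals LS - ES as well
critical = [a for a in order if total_float[a] == 0]
```

For this data the script gives a project time of 31 months, a total float of 5 for activity D, and the critical activities A, B, C, H and I, matching the path found earlier.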
In the previous section, the critical path and the project length were determined on the basis of
activity times that were assumed to be known and constant. However in reality in most projects
these activity times are unlikely to be predicted correctly. In PERT, we assume that it is not
possible to estimate the time for each activity precisely and that only probabilistic estimates of
time are possible. This method uses three time estimates for an activity. They are:
• Optimistic Time (a). This is the shortest time the activity can take to complete. It is based
on the assumption that there will not be any difficulty in completing the work
• Most likely time (m) This refers to the time that would normally take to complete an
activity. The most likely time estimate is between the optimistic and pessimistic time
estimate.
• Pessimistic time (b) This is the longest time the activity could take to finish. It assumes
that unexpected problems can occur during the execution of the activity
Depending on the values of a, m, and b, the resulting distribution of activity duration can take a
variety of forms. Typically the activity completion times are assumed to follow the beta distribution,
as shown in Figure 1. The beta distribution is a skewed curve, which can be either positively or
negatively skewed; the one shown is positively skewed.
The expected time (te) of an activity is the time estimate based on the weighted arithmetic
mean of a, m and b. It can be calculated as follows:
te = (a + 4m + b) / 6
The variance σ² of the completion time of an activity is calculated as follows:
σ² = ((b − a) / 6)²
To demonstrate the use of PERT, let us take an illustration. Instead of a single estimate, there are
three time estimates.
Table 4: Three time estimates of activity times
Activity Predecessor Time estimates
Activity Optimistic (a) Most likely( m) Pessimistic (b)
A - 1 4 7
B A 2 6 7
C D 3 4 6
D A 6 12 14
E D 3 6 12
F B,C 6 8 16
G E,F 1 5 6
First let us draw the project network reflecting the precedence relationships:
[Figure: project network for the PERT example, nodes 1 to 5, with activities A, B, C, D, E, F and G]
Next we need to find the expected activity times and variances, and then we can apply the
concepts learnt earlier to compute the critical path. The calculations of expected times and variances
are shown in Table 5.
Once the expected times of the activities are obtained, the critical path of the project network is
determined using these time estimates. PERT methodology assumes that the summation of
expected times and variances of the critical activities would yield the expected project duration
and its variance
Path                Length
(i) A→B→F→G         23
(ii) A→D→E→G        26.33
(iii) A→D→C→F→G     33
The path (A→D→C→F→G) is the critical path, as it takes the longest period of time to
complete, i.e. 33 weeks.
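The expected times and variances behind these path lengths can be reproduced directly from the two formulas te = (a + 4m + b)/6 and σ² = ((b − a)/6)², using the estimates of Table 4:

```python
# Three time estimates (a, m, b) from Table 4
est = {'A': (1, 4, 7), 'B': (2, 6, 7), 'C': (3, 4, 6), 'D': (6, 12, 14),
       'E': (3, 6, 12), 'F': (6, 8, 16), 'G': (1, 5, 6)}

te = {k: (a + 4 * m + b) / 6 for k, (a, m, b) in est.items()}   # expected times
var = {k: ((b - a) / 6) ** 2 for k, (a, m, b) in est.items()}   # variances

critical = ['A', 'D', 'C', 'F', 'G']        # the longest path found above
length = sum(te[k] for k in critical)        # expected project duration
sigma2 = sum(var[k] for k in critical)       # project variance
```

The sums over the critical activities give an expected duration of 33 weeks and a project variance of 6.5, which is the quantity used in the probability calculation below.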
z = (x − te) / √(Σσp²)
where x is the desired completion time, te the expected project duration, and Σσp² the sum of the variances of the critical activities. Here te = 33 and x = 34, so z = (34 − 33)/√6.5 = 0.39.
[Figure: normal curve with mean 33, showing the shaded area to the left of x = 34]
It is observed from the z table that the probability value of z=0.39 is 0.1517. However as we
studied in unit 4 of block 1, this area is from the mean and we need to find the total shaded area
as shown in the above figure. The desired probability is 0.5+ 0.1517= 0.6517, so we can say
there is 65 % chance of completing the project by the desired time.
Check Your Progress 3
The following table of probabilistic time estimates (in weeks) and activity predecessors
are provided for a project.
Time Estimates (weeks)
Activity    a    m    b    Predecessor
A 3 5 7 --
B 4 8 10 A
C 2 3 5 A
D 6 9 12 B, C
E 5 9 15 D
4. Using given data, the variance of the project’s total completion time is
a. 5.472 weeks.
b. 5.222 weeks.
c. 4.872 weeks.
d. 3.752 weeks.
5. Using given data, the probability that the project could be completed in 34 weeks or less is
approximately
a. 86 percent.
b. 89 percent.
c. 91 percent.
d. 96 percent.
1. (a)
Solution: te = (3 + 4×5 + 7)/6 = 5.0 weeks
2. (d)
Solution: variance = ((15 − 5)/6)² = 2.778
3. (c)
4. (b)
5. (c)
Solution: z = (34 − 31)/2.285 = 1.31, Pr. = 0.91
1.7 Glossary
Program Evaluation and Review Technique (PERT): A network based project scheduling
techniques with uncertain activity times.
Critical path method (CPM): A network based scheduling technique with certain activity
times.
Activities: Specific jobs or tasks that are components of a project.
Immediate Predecessor: The activities that must be completed immediately prior to the start of
a given activity
Project Network: A graphical representation of a project that depicts the activities and shows
the predecessor relationships among the activities
Critical Path: The longest path in a project network.
Earliest Start Time: The earliest time an activity can begin.
Earliest Finish Time: The earliest time an activity can be completed
Latest Start Time: The latest time an activity may begin without increasing the project
completion time.
Latest Finish Time: The latest time an activity may be completed without increasing the project
completion time.
Float/Slack: The length of the time an activity can be delayed without affecting the project
completion time
Optimistic Time: The minimum activity time if everything progresses ideally
Most Probable Time: The most probable activity time under normal conditions.
Pessimistic Time: The maximum activity time if significant delays are encountered.
Expected Time: The average activity time
Beta Probability Distribution: A probability distribution used to describe activity times
1.8 Assignment
1.State the rules of constructing a network.
2.What is critical path? State the necessary and sufficient conditions of critical path. Can a
project have multiple paths?
3.Explain the concept of float? Distinguish clearly between free and independent float.
4.A small project consists of seven activities for which relevant data is given below:
1.9 Activities
You are made in charge of planning and coordinating the next sales management training program of
your company. List out the activities that need to be done to organize the program, with
assumed activity times, and develop a network.
Food Solutions Ltd distributes a variety of food products that are sold through grocery stores and
supermarket outlets. The company receives orders directly from the individual outlets, with a
typical order requesting the delivery of several cases of anywhere from 20 to 50 different
products. Under the company’s current warehouse operation, warehouse clerks dispatch order
picking personnel to fill each order and have the goods moved to the warehouse shipping area.
Because of the high labour costs and relatively low productivity of hand order picking,
management decided to automate the warehouse operation by installing a computer controlled
order picking system, along with a conveyor system for moving goods from storage to the
warehouse shipping area.
The director of material management has been named the project manager in charge of the
automated warehouse system. After consulting with members of the engineering staff and
warehouse management personnel, the director compiled a list of activities associated with the
project. The optimistic, most probable and pessimistic times have also been provided for
each activity.
Develop a report that presents the activity schedule and expected project completion time for the
warehouse expansion project. The top management of Food solutions established a required 40
week completion time for the project. Can this completion be achieved? Include probability
distribution in your discussion. What recommendations do you have if the 40 week completion
time is required?
1.11 Further Reading
2.1 Introduction
2.11 Glossary
2.12 Assignment
2.13 Activities
2.14 Case Study
2.15 Further Reading
2.0 Learning Objectives
2.1 Introduction
Waiting lines are formed when there are more arrivals than can be handled at the service
facility; no waiting line forms if arrivals are fewer than that. Thus a lack of adequate
facilities causes waiting lines of customers to form. The time a customer is required to
spend in a waiting line is often undesirable. The only way the demand for service can be
met is to increase the service capacity, or raise service efficiency to a higher level if possible. The
service capacity can be built to such a level that demand at peak times can be met. But
adding more checkout clerks, bank tellers or servers is not always the most
economical strategy for improving service, as the system will remain idle when there are few or
no customers. The manager therefore needs to decide on an appropriate level of service which is
neither too low nor too high, so that waiting time can be kept within tolerable limits. The
objective of waiting line models is to provide such information to managers that they are able to
make decisions that balance desirable service levels against the cost of providing the service.
[Figure: structure of a queuing system: input source, arrival process, queue, service system, customers leave the system]
The arrivals from the input populations can be classified on different basis as follows:
Source of arrival: Customer arrivals at a service system may be drawn from a finite or infinite
population. For example, all the people of a city can be potential customers of a supermarket.
The number of people being very large, it can be taken as infinite. An infinite population is large
enough in relation to the service system that changes in population size caused by subtractions or
additions do not significantly affect the system probabilities. However, there
are business situations where the population is considered finite. For example, consider a group
of six machines maintained by one repairman. When one machine breaks down, the source
population is reduced to five, and the chance of another machine breaking down is less than when
six machines were operating. The probability of another breakdown changes again if two
machines are down, with only four operating.
Size of arrival: Customers may arrive for service individually or in groups. Single arrivals
are illustrated by customers visiting banks, salons etc. On the other hand, families visiting
restaurants and shipments being loaded into trucks are examples of bulk or batch arrivals.
Arrival distribution: Defining the arrival process for a waiting line involves determining the
distribution of customer arrival times. Queuing models wherein the number of arrivals in a
given period of time is known with certainty are known as deterministic models. On the other
hand, in many waiting line situations the arrivals occur randomly and independently of other
arrivals, and we cannot predict when an arrival will occur. In such cases, a frequently employed
assumption is that the Poisson probability distribution provides a good description of the arrival
pattern.
Degree of Patience: A patient arrival will wait as long as the service facility takes to serve
it. There are two types of impatient arrivals. Members of the first class arrive, view the
service facility and the length of the line, and decide to leave. Those in the second class
arrive, view, wait in line and leave after some time. The behaviour of the first type is known
as balking and that of the second as reneging.
2.2.2 Queue Structure
In queue structure the important thing to know is the queue discipline, which means the set of
rules for determining the order of service to customers in a waiting line. The most common
disciplines are:
1. First come first served (FCFS)
2. Last come first served (LCFS)
3. Service in random order (SIRO)
4. Priority service/reservations
There are two aspects to the service system: (1) the structure of the service system, and (2)
the distribution of service time.
Structure of service system: The structure of a service system means how the service facilities
exist. Waiting line processes are generally classified into four basic structures: Single-channel
single-phase, single-channel multiple-phase, multiple-channel single-phase and multiple-channel
multiple-phase. Channels are the number of parallel servers and phases denote the number of
sequential servers. A bank with a single clerk providing service to a single line of customers is
an example of single-channel single-phase queuing system. If several clerks are providing
service to a single line of customers, it will be an example of multiple-channel single-phase
system. An example of single-channel multiple-phase system is the manufacturing assembly line
type operation in which the product goes through several sequential machines at workstations to
be worked on. If there are two or more assembly lines manufacturing the same product, it is an
example of multiple-channel multiple-phase.
Distribution of Service Time: The service time is the time a customer spends at the service
facility once the service has started. Waiting line formulas generally specify service rate as the
number of units served per unit of time. A constant service time rule states that each service
takes exactly the same time, as in case of automated operations. When service times are random,
they can be approximated by the exponential probability distribution.
The techniques of waiting line analysis do not provide an optimal or best solution. Instead,
they generate certain measures, referred to as operating characteristics, that describe the
performance of the queuing system. Management uses these measures to evaluate the system and
take decisions. It is assumed that in the long run the performance measures approach constant
average values, referred to as the steady state. The following notations are used to define the
basic operating characteristics:
λ = mean arrival rate (number of arrivals per unit of time)
μ = mean service rate (number of units served per unit of time)
ρ = utilisation or traffic intensity
P0 = probability that the system is idle; Pn = probability of n units in the system
Ls, Lq = average number of units in the system and in the queue
Ws, Wq = average time a unit spends in the system and in the queue
In each of these models the customer arrivals follow the Poisson distribution. If the arrivals
are independent with a mean arrival rate of λ per period of time, the Poisson probability
function gives the probability of x arrivals in a specific time period as (discussed in detail
in Block 1, Unit 3):
P(x) = λ^x e^(−λ) / x!
Where, x = number of arrivals in the time period
λ = mean number of arrivals per time period
e = 2.71828
For the first two models, the service times are distributed exponentially. Using the exponential
probability distribution, the probability that the service time will be less than or equal to a
time of length t is (discussed in detail in Block 1, Unit 4):
P(service ≤ t) = 1 − e^(−μt)
Where, μ = mean number of units that can be served per time period
e = 2.71828
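Both distributions are easy to evaluate directly. The sketch below (Python; the function names
and parameter values are ours, chosen purely for illustration) computes the probability of x
Poisson arrivals and the probability that an exponential service completes within time t.

```python
import math

def poisson_prob(x, lam):
    """P(x) = lam^x * e^(-lam) / x! -- probability of exactly x arrivals per period."""
    return (lam ** x) * math.exp(-lam) / math.factorial(x)

def service_within(t, mu):
    """P(service <= t) = 1 - e^(-mu*t) for exponentially distributed service times."""
    return 1 - math.exp(-mu * t)

# Illustrative values: a mean of 3 arrivals per period, a service rate of 2 per period
p_two_arrivals = poisson_prob(2, 3)     # probability of exactly 2 arrivals
p_quick_service = service_within(1, 2)  # probability a service finishes within 1 period
```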
Further, in each of the models the customer service is assumed to be in first-come-first-served
(FCFS) order. Now we will describe each of the models in detail.
To evaluate a model we first need to check whether a service station can handle the customer
demand for service. If λ ≥ μ, the waiting line will grow without limit and the system will
collapse. For the system to be functional the arrival rate should be less than the service rate
(λ < μ).
The following formulas are used to compute the steady state operating characteristics:
1. Probability that the system is busy, or probability that a customer has to wait for service:
ρ = λ/μ
where ρ (rho) is also known as the traffic intensity or utilisation.
2. Probability that zero units are in the system, or probability that the system is idle:
P0 = 1 − ρ = 1 − λ/μ
3. Probability of exactly n customers in the system:
Pn = ρ^n P0 = (λ/μ)^n P0
4. Average/expected number of customers in the system:
Ls = λ/(μ − λ)  or  ρ/(1 − ρ)
5. Average/expected number of customers in the queue:
Lq = λ²/(μ(μ − λ))  or  ρ²/(1 − ρ)
6. Average waiting time in the queue:
Wq = λ/(μ(μ − λ))  or  ρ/(μ − λ)
7. Average waiting time in the system:
Ws = 1/(μ − λ)
Illustration 2: A repairman finds that the time spent on a job has an exponential distribution
with mean 30 minutes. If machines break down at an average rate of 10 per 8-hour day, what is
the expected idle time each day? How many jobs are ahead of the set just brought in? What is the
probability that four machines are waiting to be repaired?
Here the arrival rate λ = 10 machines/day, and the mean service time is 30 minutes, so 2
machines are repaired per hour and (2 × 8) = 16 machines per day, i.e. μ = 16 machines/day.
The expected idle time each day follows from the probability that the system is idle:
P0 = 1 − λ/μ = 1 − 10/16 = 0.375, i.e. 0.375 × 8 = 3 hours per day.
The number of jobs ahead of the set just brought in is the average number of machines in the
system:
Ls = λ/(μ − λ) = 10/(16 − 10) = 5/3 = 1.67 machines
The probability that four machines are waiting means there are five machines in the system in
total:
P5 = (λ/μ)^5 P0 = (10/16)^5 × 0.375 = 0.036
The following formulas are used to compute the steady state operating characteristics for
multiple-channel waiting lines, where K is the number of channels (servers) and ρ = λ/(Kμ):
1. Probability that the system is idle (zero customers in the system):
P0 = [ Σ (from n = 0 to K−1) (λ/μ)^n / n!  +  ((λ/μ)^K / K!) × Kμ/(Kμ − λ) ]^(−1)
2. Probability that a customer arriving in the system must wait for service (i.e. all the
servers are busy):
Pw = ((λ/μ)^K / K!) × (Kμ/(Kμ − λ)) × P0
3. Average number of customers in the waiting line:
Lq = ((λ/μ)^K ρ / (K! (1 − ρ)²)) × P0
4. Average number of customers in the system:
Ls = Lq + λ/μ
5. Average waiting time in the queue:
Wq = Lq / λ
6. Average waiting time in the system:
Ws = Wq + 1/μ
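The multiple-channel formulas, including the idle probability P0, can be sketched as follows
(Python; the function name is ours, and the rates λ = 10 per hour, μ = 4 per server and K = 3
servers are illustrative):

```python
from math import factorial

def mmk_metrics(lam, mu, k):
    """Steady-state operating characteristics of a multiple-channel (M/M/K) queue."""
    assert lam < k * mu, "need arrival rate < combined service rate"
    r = lam / mu                   # offered load (lambda/mu)
    rho = lam / (k * mu)           # utilisation per server
    p0 = 1 / (sum(r ** n / factorial(n) for n in range(k))
              + (r ** k / factorial(k)) * (k * mu) / (k * mu - lam))
    pw = (r ** k / factorial(k)) * (k * mu) / (k * mu - lam) * p0  # all servers busy
    lq = (r ** k * rho) / (factorial(k) * (1 - rho) ** 2) * p0     # average queue length
    ls = lq + r                    # average number in the system
    wq = lq / lam                  # average wait in the queue
    ws = wq + 1 / mu               # average time in the system
    return {"P0": p0, "Pw": pw, "Lq": lq, "Ls": ls, "Wq": wq, "Ws": ws}

# Illustrative rates: 10 arrivals/hour, 3 servers at 4 customers/hour each
m = mmk_metrics(10, 4, 3)
```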
Illustration 3: The customer care centre of a departmental store helps customers with their
questions, complaints or issues regarding credit card bills. Chairs placed along the wall form a
single waiting line. Customers are served by three store representatives on a first-come-first-
served basis. The store management wants to analyse this queuing system, as excessive waiting
times can make customers angry enough to shop at other stores. A study of the customer service
department over a 6-month period shows that an average of 10 customers arrive per hour and an
average of 4 customers can be served per hour by a customer care representative.
Here, λ = 10 customers/hour
μ = 4 customers/hour
K = 3 customer representatives
Kμ = 3 × 4 = 12 (> λ)
Using the multiple-server model formulas, we can compute the following operating
characteristics for the departmental store.
Probability that the system is idle:
P0 = [ Σ (from n = 0 to 2) (10/4)^n / n!  +  ((10/4)³ / 3!) × 12/(12 − 10) ]^(−1) = 0.045
Probability that a customer arriving in the system must wait for service (i.e. all three servers
are busy):
Pw = ((10/4)³ / 3!) × (3 × 4 / (3 × 4 − 10)) × 0.045 = 0.703
The department store's management has observed that customers are frustrated by the waiting
time of 21 minutes and the 0.703 probability of waiting. The management is considering
employing an additional service representative to improve the level of service. The operating
characteristics recomputed with K = 4 service representatives are: P0 = 0.073, Pw = 0.31,
Ls = 3 customers, Ws = 18 minutes, Lq = 0.5 customers, Wq = 3 minutes.
The waiting time is considerably reduced, from 21 minutes to 3 minutes. However, this
improvement in the quality of service would have to be compared with the cost of adding an
extra service representative before taking any decision.
2.7 Single-Channel with Poisson Arrivals and Arbitrary Service Times (M/G/1)
This model is based on following assumptions:
• The arrivals follow a Poisson distribution with a mean arrival rate of λ
• The service time has a general probability distribution with a mean service rate of µ and
standard deviation of σ.
• There is a single service station
• A single waiting line is formed
• Customers are served on FCFS basis
The steady state operating characteristics for the M/G/1 model are computed from the standard
Pollaczek–Khinchine formula for the average queue length,
Lq = (λ²σ² + ρ²) / (2(1 − ρ)), where ρ = λ/μ,
with the remaining measures following as Ls = Lq + λ/μ, Wq = Lq/λ and Ws = Wq + 1/μ.
Here the arrival rate λ = 21/hour, or 21/60 = 0.35 customers per minute (converted to minutes,
as the rest of the data is in minutes). The mean service time of 2 minutes shows that the
service rate of the clerk is 1/2 = 0.50 customers per minute.
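Under the stated rates (λ = 0.35, μ = 0.50 per minute) the Pollaczek–Khinchine expression can be
checked numerically. In the sketch below (Python; our own helper) the two σ values are the
standard special cases rather than data from the example: exponential service, where σ equals
the mean service time, and constant service, where σ = 0.

```python
def mg1_lq(lam, mu, sigma):
    """Average queue length of an M/G/1 queue via the Pollaczek-Khinchine formula."""
    rho = lam / mu
    assert rho < 1, "the system is unstable unless rho < 1"
    return (lam ** 2 * sigma ** 2 + rho ** 2) / (2 * (1 - rho))

# Clerk example: 0.35 arrivals/min, service rate 0.50/min (mean service time 2 minutes)
lq_exp = mg1_lq(0.35, 0.50, 2.0)  # sigma = mean service time for exponential service
lq_det = mg1_lq(0.35, 0.50, 0.0)  # sigma = 0 for constant service times
```

Constant service times give exactly half the queue length of exponential ones, which is the
classic insight this formula makes visible.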
The information we derive from the operating characteristics of the various models can be used
to determine the appropriate level of service. Inadequate service causes excessive waiting,
which has a cost in terms of customer frustration, loss of goodwill, the direct cost of idle
machines (machines needed in production waiting for repair work), etc. On the other hand, a high
service level results in higher set-up cost and idle time for the service station. Thus the goal
of queuing modelling is the achievement of an economic balance between the cost of providing
service and the cost associated with waiting for service. The optimum level of service is the
one at which the total of waiting-time cost and cost of providing service is minimum. Figure 1
shows that increasing the service level increases the cost of service and reduces the cost of
waiting time.
The thick curve shows that the total cost decreases to a point and then starts increasing. The
service level corresponding to its minimum point is the optimum service level.
Case II (two workers):
Cost of waiting (Cw) = down-time cost for 0.75 machines = 250 × 0.75 = Rs 187.5 per hour
Cost of service (Cs) for two workers = 160 × 2 = Rs 320 per hour
Total cost per hour = Cw + Cs = 187.5 + 320 = Rs 507.5
Case III (three workers):
Cost of waiting (Cw) = down-time cost for 0.60 machines = 250 × 0.60 = Rs 150 per hour
Cost of service (Cs) for three workers = 160 × 3 = Rs 480 per hour
Total cost per hour = Cw + Cs = 150 + 480 = Rs 630
Comparing the costs of one, two and three workers, the total cost is lowest in Case II. Hence
the optimal solution is to hire two workers.
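The comparison above amounts to minimising total hourly cost over the crew sizes considered. A
sketch (Python; the 0.75 and 0.60 expected down machines and the Rs 250 and Rs 160 cost rates
are from the text, while the function name is ours):

```python
def total_cost(workers, machines_down, wait_cost=250, worker_cost=160):
    """Total hourly cost = waiting (down-time) cost + service (wage) cost."""
    return wait_cost * machines_down + worker_cost * workers

# Expected number of down machines per crew size, taken from the analysis in the text
cases = {2: 0.75, 3: 0.60}
costs = {k: total_cost(k, down) for k, down in cases.items()}
best = min(costs, key=costs.get)   # crew size with the lowest total hourly cost
```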
2.9 Let Us Sum Up
Waiting line theory deals with situations where customers arrive, wait for the service, get the
service and leave the system. In this unit we discussed a variety of waiting line models that have
been developed to help managers make better decisions concerning the operation of waiting
lines. The formulas required to compute operating characteristics or performance measures for
each model were presented. The operating characteristics include- Probability that system is idle,
Average number of customers in system, average number of customers in queue, average time a
unit spends in the waiting line, average time a unit spends in system, probability that arriving
customers have to wait for service .
Queuing structures are analyzed to determine the optimum level of service, at which the total
cost of providing service and waiting is minimized. An increase in the level of service
increases the cost of providing service but reduces the cost of waiting. While waiting line
models can be deterministic as well, the probabilistic ones occur commonly and are the ones
analyzed. The three models discussed in this unit are: single-channel Poisson arrivals with
exponential service times (M/M/1), multiple-channel Poisson arrivals with exponential service
times (M/M/C), and single-channel Poisson arrivals with arbitrary service times (M/G/1). For a
queuing system to be functional, the arrival rate of customers per unit of time should be less
than the service rate.
2.11 Glossary
2.12 Assignment
1. Which assumptions are necessary to employ the (M/M/C) waiting line model?
2. Discuss the waiting line system in detail with some queuing situations.
3. Describe a single-server waiting line model. Give an example from real life for each of the
following queuing disciplines:
a. First come first served
b. Last come first served
4. The mechanic at Carpoint is able to install new mufflers at an average rate of three per
hour, while customers arrive at an average rate of 2 per hour. Assuming that the conditions for
a single-server infinite-population model are all satisfied, calculate the following:
a. Utilisation parameter
b. The average number of customers in the system
c. The average time a customer spends in the queue
d. The probability that there are more than three customers in the system.
5. A service station has five mechanics each of whom can service a scooter in 2 hours on an
average. The scooters are registered at a single counter and then sent for servicing to
different mechanics. Scooters arrive at the service station at an average rate of 2 scooters
per hour. Assuming that arrivals are Poisson distributed and servicing times are
distributed exponentially, determine:
a. The probability that system is idle
b. The probability that there are 3 scooters in the service centre
c. The expected number of scooters waiting in the queue
d. The average waiting time in the queue.
2.14 Activities
Analyse the following queuing systems by describing their various system properties:
a) Hospital Emergency Room
b) Traffic light
c) Computer system at university
2.14 Case Study
A fast-shop drive-in market has one checkout counter where one employee operates the cash
register. The combination of the cash register and the operator is the server in this queuing
system; the customers who line up to pay for their selected items form the waiting line.
Customers arrive at a rate of 24 per hour according to a Poisson distribution, and service times
are exponentially distributed with a mean rate of 30 customers per hour.
The arrival rate of 24 per hour means that, on average, a customer arrives about every 2.5
minutes (60/24). This indicates the store is busy. Because of the nature of the store, customers
purchase few items and expect quick service. Customers expect to spend more time in a
supermarket, where they make larger purchases, but they shop at a drive-in market because it is
quicker than a supermarket. Given customers' expectations, the manager believes it would be
unacceptable for a customer to wait more than 5 minutes in the waiting line.
The market manager wants to determine the operating characteristics of this waiting line system
and to test whether hiring another employee to pack up purchases will reduce customer waiting
time and still be economically viable. An extra employee will cost the market manager $150 per
week. With the help of a market research agency, the manager has determined that for each minute
the customer waiting time is reduced, the store avoids a loss in sales of $75 per week. The
service rate with two employees would be 40 customers per hour.
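The case can be worked through with the single-channel formulas. A sketch under the case's
stated figures (24 arrivals/hour, service rates of 30 and 40 per hour, a $150/week wage, and
$75/week saved per minute of waiting removed; the function name is ours):

```python
def wq_minutes(lam, mu):
    """Average waiting time in the queue of an M/M/1 system, in minutes."""
    return 60 * lam / (mu * (mu - lam))

current = wq_minutes(24, 30)               # one employee operating alone
improved = wq_minutes(24, 40)              # with a second employee packing purchases
weekly_saving = 75 * (current - improved)  # sales loss avoided per week
net_benefit = weekly_saving - 150          # minus the extra employee's weekly wage
```

Waiting falls from 8 to 2.25 minutes, and the $431.25 weekly saving exceeds the $150 wage, so
under these assumptions the extra employee pays for itself.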
3.1 Introduction
The models and techniques we have discussed so far in operations research involved the interests
of a single organization. For example, in the transportation problem we are interested in
minimizing cost or maximizing profit given the organizational constraints. In real life,
however, decisions are often made where two or more rational opponents are involved under
conditions of competition and conflicting interests. Game theory deals with processes where an
individual, group or organization is not in complete control of the other players, the
opponents, and addresses situations involving conflict, co-operation or both at different
levels.
The main objective of the game theory is to determine the rules of rational behavior in the
situations in which the outcomes are dependent on the actions of the interdependent players. A
game is a situation in which two or more players are competing. The players may have different
objectives but their fates are intertwined. They may have some control that influences the
outcome, but they do not have complete control over the others. Game Theory is the analysis (or
science) of rational behavior in interactive decision-making. It is therefore distinguished from
individual decision-making situations by the presence of significant interactions with other
‘players’ in the game. Game Theory can be used to help explain past events and situations, to
predict what actions players will take in future games and, based on this, to take decisions in
interactions with other players to achieve the best outcome.
Game theory models can be classified on the basis of factors like number of players involved,
sum of the gains or losses and the number of strategies employed.
If there are two participants in a game it is called a two-person game; if more than two
participants are involved, it is an n-person game. In a game, if the sum of the gains and losses
is equal to zero, it is called a zero-sum or constant-sum game. If the sum of the gains and
losses is not equal to zero, it is called a non-zero-sum game. A game is said to be finite if
each player has the option of choosing from only a finite number of strategies; otherwise it is
called infinite.
Some of the key concepts to be used in game theory are described below:
Players: The competitors or decision makers in a game are called the players of the game.
Strategies: The alternative courses of action available to a player are called strategies.
Payoff: The outcome of playing a game is called the payoff to the concerned player.
Optimal Strategy: A strategy in which the player can achieve the maximum payoff is called the
optimal strategy.
Payoff Matrix: The tabular display of the payoffs of the players under various alternatives is
called the payoff matrix.
Pure strategy: A game solution that provides a single best strategy for each player.
Mixed strategy: If there is no one specific strategy as the best strategy for any player in a game,
then the game is referred to as mixed strategy or a mixed game. Each player has to choose
different alternative courses of action from time to time.
3.3.1 Payoff Matrix
When players select particular strategies, the payoffs can be represented in the form of a
payoff matrix. Suppose firm A has m strategies and firm B has n strategies; the payoff matrix
will be:
                          Player B's strategies
                          B1     B2    ...   Bn
Player A's        A1      a11    a12   ...   a1n
strategies        A2      a21    a22   ...   a2n
                  ...     ...    ...         ...
                  Am      am1    am2   ...   amn
The matrix is written from player A's point of view. Player A wishes to gain as large a payoff
aij as possible, while player B will do his best to make aij as small as possible.
Let us assume that both firms A and B are considering three strategies to gain market share:
advertising, promotion and quality improvement. The strategies of advertising, promotion and
better quality are represented as A1, A2 and A3 respectively for firm A, and B1, B2 and B3
respectively for firm B. As shown in the matrix below, there are in total 3 × 3 = 9 combinations
of moves. Each pair of moves affects the share of the market in a particular way. As the payoff
is in terms of A, a positive payoff indicates that A has gained at the expense of firm B, while
negative pay-offs imply B's gain at A's expense. For example, the strategy of advertising by
both firms A and B will lead to a 12% market share gain for firm A, while advertising by A and
promotion by B would lead to a shift of 7% market share in favour of B. Similarly there are
pay-offs corresponding to the other pairs of moves.
B’s Strategy
B1 B2 B3
A’s Strategy A1 12 -7 -2
A2 6 7 3
A3 -10 -5 2
The conservative approach to selecting the best strategy calls for assuming the worst and acting
accordingly. With reference to the pay-off matrix, if firm A employs strategy A1, it would
expect firm B to employ strategy B2, reducing A's payoff from strategy A1 to its minimum value
of -7, a loss to firm A. If the firm employs strategy A2, it would expect firm B to employ
strategy B3, which would give a three percent gain in market share. Similarly, for strategy A3
it would expect firm B to employ strategy B1, with a loss of 10 percent. Firm A would like to
make the best of the situation by choosing the strategy that gives the maximum of these minimum
pay-offs. Since the minimum payoffs of strategies A1, A2 and A3 are -7, 3 and -10 respectively,
firm A would select A2 as its strategy. This decision rule is called the Maximin strategy.
Firm B would employ a similarly conservative approach. When B employs strategy B1, it expects
firm A to employ A1, which gives the maximum gain to A. In a similar way, adoption of B2 or B3
would make it expect firm A to adopt strategy A2. To minimize the gain of the competing firm,
firm B would select the strategy that yields the least gain to firm A. This decision rule of
firm B is called the Minimax strategy.
As discussed above, it is clear that the maximin strategy A2 of firm A and the minimax strategy
B3 of firm B lead to the same payoff. These strategies are based on the conservative approach of
choosing the best strategy by assuming that the worst will happen. By adopting the maximin
strategy, A can stop B from lowering its gain in market share below 3 percent, and by adopting
the minimax strategy, firm B can stop A from gaining more than 3 percent market share. The
situation is therefore one of equilibrium. The point of equilibrium is known as the saddle
point.
To obtain the saddle point, if it exists, we determine the minimum payoff value of each row and
the maximum payoff value of each column. If the maximum of the row minima equals the smallest of
the column maxima, it represents the saddle point. Let us continue with the same problem as an
illustration:
                          B's Strategy              Row Minima
                      B1      B2      B3
A's Strategy   A1     12      -7      -2               -7
               A2      6       7       3                3*
               A3    -10      -5       2              -10
Column Maxima         12       7       3*
The maximum of the row minima (3) equals the minimum of the column maxima (3), so the saddle
point is at A2B3 and the value of the game is 3.
It is also possible to have more than one saddle point for a given problem. For example,
consider the following matrix:
                          B's Strategy                     Row Minima
                      B1      B2      B3      B4
A's Strategy   A1      2      15      13     -14              -14
               A2     -5       6      -4      -5               -5*
               A3      5      -2       0      -5               -5*
Column Maxima          5      15      13      -5*
Against B's minimax strategy, firm A could employ either A2 or A3, each of which represents a
maximin strategy for it. As the pay-offs corresponding to B's minimax strategy and either of A's
maximin strategies are identical, there are two saddle points, represented by A2B4 and A3B4. The
value of the game is -5, a net loss of 5 points to A and an equivalent gain to B.
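The row-minima/column-maxima test can be automated. A small sketch (Python; our own helper) that
returns every saddle point of a pay-off matrix, applied to the matrix above:

```python
def saddle_points(matrix):
    """Return all (row, col) saddle points: entries that are both the minimum of
    their row and the maximum of their column (pay-offs in terms of player A)."""
    row_min = [min(row) for row in matrix]
    col_max = [max(col) for col in zip(*matrix)]
    return [(i, j) for i, row in enumerate(matrix)
            for j, v in enumerate(row)
            if v == row_min[i] and v == col_max[j]]

# The matrix above, with two saddle points (A2B4 and A3B4) and game value -5
m = [[2, 15, 13, -14],
     [-5, 6, -4, -5],
     [5, -2, 0, -5]]
points = saddle_points(m)   # [(1, 3), (2, 3)] in zero-based indices
```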
Illustration 1: Soul Ltd has forecast sales for its products and the products of its competitor,
Pure Ltd. There are four strategies for Soul Ltd (S1, S2, S3, S4) and three strategies available
to Pure Ltd (P1, P2, P3). The payoffs for all twelve combinations are given below. Considering
this information, state what would be the optimal strategy for Soul Ltd and for Pure Ltd. What
is the value of the game? Is the game fair?
                          Pure's Strategy
                      P1         P2         P3
Soul's        S1    30000     -21000      1000
Strategy      S2    18000      14000     12000
              S3    -6000      28000      4000
              S4    18000       6000      2000
To determine the optimal strategies, we examine whether a saddle point exists for the given
problem. The row minima are -21000, 12000, -6000 and 2000, so the maximin value is 12000 (at
S2); the column maxima are 30000, 28000 and 12000, so the minimax value is 12000 (at P3).
The saddle point therefore exists at S2P3. The optimal strategy for Soul Ltd is S2 and for Pure
Ltd it is P3. The value of the game is V = 12000, a gain of 12000 to Soul Ltd. Since V ≠ 0, it
is not a fair game.
It is possible that a game has no saddle point, and hence no solution in terms of pure
strategies (the maximin and minimax rules). To solve such problems we need to employ mixed
strategies. A mixed strategy represents a combination of two or more strategies that are
selected one at a time, with predetermined probabilities. Therefore, in a mixed strategy a
player decides to choose among the various alternatives in a certain ratio.
Illustration 2: The following is the pay-off matrix of a game being played by A and B. Determine
the optimal strategies for the players and the value of the game.
                     B's Strategy
                     B1      B2
A's Strategy  A1      9      -6
              A2     -5       5
As can be seen from the table, the maximin value (-5) is not equal to the minimax value (5),
implying there is no saddle point in this problem.
With mixed strategies, let player A employ strategy A1 with probability x and strategy A2 with
probability (1 - x). If B plays strategy B1, A's expected payoff can be determined from the
first column of the pay-off matrix as 9x - 5(1 - x).
Similarly, if B plays strategy B2, the expected payoff of A is -6x + 5(1 - x).
We shall find a value of x such that the expected payoff of A is the same irrespective of the
strategy adopted by B. This is obtained by equating the two expressions and solving:
9x - 5(1 - x) = -6x + 5(1 - x)
or 9x - 5 + 5x = -6x + 5 - 5x
or 25x = 10
or x = 10/25 = 2/5
A will do best by choosing strategies A1 and A2 in the proportion 2:3 (i.e. A1 2/5 of the time
and A2 3/5 of the time).
The expected pay-off for A applying the mixed strategy is:
9x - 5(1 - x) = 9 × 2/5 - 5 × 3/5 = 3/5
or
-6x + 5(1 - x) = -6 × 2/5 + 5 × 3/5 = 3/5
Thus A will have a net gain of 3/5 in the long run.
We can determine the mixed strategy of B in a similar way. If player B plays B1 with probability
y and B2 with probability (1 - y), then equating B's expected pay-outs against A's two
strategies, 9y - 6(1 - y) = -5y + 5(1 - y), gives 25y = 11, or y = 11/25.
Thus B would play strategies B1 and B2 in the ratio 11:14 in a random manner.
The expected pay-off (in terms of A's gain) when B applies the mixed strategy is:
9y - 6(1 - y) = 9 × 11/25 - 6 × 14/25 = 3/5
or
-5y + 5(1 - y) = -5 × 11/25 + 5 × 14/25 = 3/5
i.e. an expected loss of 3/5 to B.
Thus, we conclude that A and B should both use mixed strategies as given below and the value
of the game in long run is 3/5
Strategy Probability
For A, A1 2/5
A2 3/5
For B, B1 11/25
B2 14/25
In general, for a two-person zero-sum game in which players A and B have strategies A1, A2 and
B1, B2 respectively and the payoffs are as given below, let x be the probability of player A
choosing strategy A1 and y the probability of player B choosing strategy B1:
                     B's Strategy
                     B1      B2
A's Strategy  A1     A11     A12
              A2     A21     A22
Then,
x = (A22 - A21) / ((A11 + A22) - (A12 + A21))
y = (A22 - A12) / ((A11 + A22) - (A12 + A21))
and the value of the game is
V = (A11 A22 - A12 A21) / ((A11 + A22) - (A12 + A21))
Substituting the values into these equations, we obtain results matching those already found:
x = (5 + 5) / ((9 + 5) - (-6 - 5)) = 10/25 = 2/5
y = (5 - (-6)) / ((9 + 5) - (-6 - 5)) = 11/25
V = (9 × 5 - (-5)(-6)) / ((9 + 5) - (-6 - 5)) = 15/25 = 3/5
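These three closed-form expressions can be wrapped in a small helper (Python; the function name
is ours) and checked against the worked example:

```python
def solve_2x2(a11, a12, a21, a22):
    """Mixed-strategy solution of a 2x2 zero-sum game with no saddle point.
    Returns (x, y, v): P(A plays A1), P(B plays B1) and the game value."""
    d = (a11 + a22) - (a12 + a21)
    x = (a22 - a21) / d
    y = (a22 - a12) / d
    v = (a11 * a22 - a12 * a21) / d
    return x, y, v

# The worked example: A's pay-offs are 9, -6 (row A1) and -5, 5 (row A2)
x, y, v = solve_2x2(9, -6, -5, 5)   # x = 2/5, y = 11/25, v = 3/5
```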
3.5 Principle of Dominance
Sometimes one strategy available to a player is found to be better than some other strategy or
strategies. Such a strategy is said to dominate the others. This concept is useful in
simplifying games and finding the solution to a game problem. Consider the following example:
B’s Strategy
B1 B2 B3
A’s Strategy A1 0 -1 2
A2 5 4 -3
A3 2 3 -4
Let us follow the usual procedure for identifying a pure strategy and compute the row minima
and column maxima as below:
B’s Strategy Row Minima
B1 B2 B3
A’s Strategy A1 0 -1 2 -1*
A2 5 4 -3 -3
A3 2 3 -4 -4
Column Maxima 5 4 2*
The maximum of the row minima is -1 and the minimum of the column maxima is 2. As the maximin
and minimax values are not equal, this two-person zero-sum game does not have an optimal pure
strategy. For a problem larger than a 2 X 2 matrix, we cannot compute the mixed strategy
probabilities using the algebraic equations, as we did in the previous section.
If a game larger than 2 X 2 requires a mixed strategy, we need to reduce the size of the matrix
by looking for dominated strategies. A strategy dominates another if it is at least as good
regardless of what the opponent does. For example, comparing strategies A2 and A3: in column B1,
5 > 2; in column B2, 4 > 3; and in column B3, -3 > -4. Thus, regardless of what player B does,
player A always obtains higher values from strategy A2 than from A3. Therefore strategy A2
dominates strategy A3, and strategy A3 can be dropped from player A's consideration. This helps
us reduce the size of the game. After the elimination, the game becomes:
B’s Strategy
B1 B2 B3
A’s Strategy A1 0 -1 2
A2 5 4 -3
Now, comparing A1 and A2, we cannot find a dominated strategy. Next we look for dominated
strategies of player B. We should remember that player B looks for smaller values, as the matrix
is in terms of A's payoffs. Comparing strategies B1 and B2: in row A1, -1 < 0; in row A2, 4 < 5.
Thus, regardless of what player A does, player B would always prefer the smaller values of
strategy B2 over strategy B1. Therefore B1 is dominated by strategy B2 and hence is eliminated.
B’s Strategy
B2 B3
A’s Strategy A1 -1 2
A2 4 -3
Applying the algebraic method to the reduced 2 X 2 game:
x = (A22 - A21) / ((A11 + A22) - (A12 + A21)) = (-3 - 4) / ((-1 - 3) - (2 + 4)) = -7/-10 = 7/10
y = (A22 - A12) / ((A11 + A22) - (A12 + A21)) = (-3 - 2) / ((-1 - 3) - (2 + 4)) = -5/-10 = 1/2
Thus we conclude that A and B should both use the mixed strategies given below, and the value of
the game in the long run is 1/2. Note that the probability found for B applies to B2, the first
column of the reduced game, since B1 was eliminated:
Strategy      Probability
For A,  A1       7/10
        A2       3/10
For B,  B1        0
        B2       1/2
        B3       1/2
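The dominance reductions carried out above can be sketched as an iterated-elimination routine
(Python; our own helper, using the "at least as good" criterion from the text, with the matrix
read as A's pay-offs):

```python
def eliminate_dominated(matrix):
    """Iteratively remove dominated rows (A maximises) and dominated
    columns (B minimises) from a pay-off matrix given in terms of A."""
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    changed = True
    while changed:
        changed = False
        for r in rows[:]:   # row r is dominated by row s if s is >= r in every column
            if any(s != r and all(matrix[s][c] >= matrix[r][c] for c in cols)
                   for s in rows):
                rows.remove(r)
                changed = True
        for c in cols[:]:   # column c is dominated by d if d is <= c in every row
            if any(d != c and all(matrix[r][d] <= matrix[r][c] for r in rows)
                   for d in cols):
                cols.remove(c)
                changed = True
    return [[matrix[r][c] for c in cols] for r in rows]

# The example above reduces to the 2x2 game [[-1, 2], [4, -3]]
reduced = eliminate_dominated([[0, -1, 2], [5, 4, -3], [2, 3, -4]])
```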
There are problems, where after applying dominance rule, game is reduced to a 2 X n or a m X 2
matrix. In such case, the problems can be solved graphically.
2. In the following two-person game, which strategy can be eliminated by the
dominance rule?
Y1 Y2
X1 9 13
X2 12 8
X3 6 14
a) X1
b) X2
c) X3
d) None of the above
When player A has only 2 strategies to choose from and player B has n, the game is of order
2 X n, whereas when B has only two strategies and A has m strategies, the game is an m X 2 game.
The problem may originally be a 2 X n or an m X 2 game, or it might have been reduced to such a
size after applying the dominance rule. Using the graphical method, the aim is to reduce the
game to order 2 X 2 by identifying and eliminating the dominated strategies, and then to solve
it by the algebraic method used earlier. The game value and optimal strategy can be read from
the graph, but generally the algebraic method is adopted to get the answer.
Here the payoff matrix consists of m rows and 2 columns; we discuss how to solve an m X 2 game.
The first step is to check whether the problem has a saddle point. As can be seen below, this
game has no saddle point.
Next we try to simplify the matrix by applying the dominance rule. In this problem dominance
cannot be applied, so we cannot simplify the matrix any further.
Let y be the probability that player B selects strategy B1 and (1 - y) the probability that
player B selects strategy B2. When player A chooses to play A1, the expected payoff for B is
6y - 7(1 - y) = 13y - 7. Similarly the expected payoffs against strategies A2, A3 and A4 are
found, as shown in the table below. To plot the graph, the value of the pay-off at y = 0 and
y = 1 is also calculated for each strategy.
y = (A22 - A12) / ((A11 + A22) - (A12 + A21)) = (-1 - 3) / ((1 - 1) - (3 + 5)) = -4/-8 = 1/2
Strategy Probability
For A, A1 0
A2 3/4
A3 0
A4 1/4
For B, B1 1/2
B2 1/2
For a 2 X n game, the expected pay-off values are calculated for player A: x is the probability
of choosing strategy A1 and (1 - x) the probability of choosing strategy A2. The x-axis
represents the values of x and the y-axis the payoff of player A. The highest intersection point
on the lower boundary of the graph is the maximin point for player A.
Summary of the Steps for solving Two- person Zero –sum games
1. Use the maximin strategy for player A and minimax strategy for player B to determine
whether a pure strategy solution exists. If there is a saddle point, it is the optimal solution.
2. If a pure strategy does not exist and the game is larger than 2 X 2 , identify a dominated
strategy to remove a row or a column. Develop a reduced pay-off table and continue to
use dominance rule to reduce as many rows and columns as possible.
3. If reduced game is 2 X n or m X 2, solve graphically to reduce it to a 2 X 2 matrix
4. If the reduced game is 2 X 2, solve for the optimal mixed strategy probabilities using
algebraic method.
If the game cannot be reduced to a 2 X 2 game, a linear programming model is used to solve
for the optimal mixed strategy probabilities, which is beyond the scope of this unit.
In this unit, we described how to solve two-person zero-sum games. In these games, the gain
(loss) of one player and the loss (gain) of the other player always sum to zero. The steps used
to determine whether a two-person zero-sum game has an optimal pure strategy were
discussed. If a pure strategy exists, a saddle point determines the value of the game. If an
optimal pure strategy does not exist for a two-person zero-sum 2 X 2 game, the algebraic
method is used to derive the mixed-strategy probabilities. In a mixed strategy, each player uses
probabilities to select a strategy for each play of the game. The dominance rule used for
reducing the size of a mixed-strategy game was also discussed. If the elimination of dominated
strategies can reduce a larger game to a 2 X 2 game, the algebraic procedure is used to find a
solution. The solution of 2 X n and m X 2 games using the graphical method was also discussed.
3.9 Glossary
Game theory: The study of decision situations in which two or more players compete as
adversaries.
Two-person Zero-sum game: A game with two players in which the gain to one player is equal
to the loss to the other player.
Optimal Strategy: A strategy in which the player can achieve the maximum payoff is called the
optimal strategy.
Saddle point: A condition that exists when pure strategies are optimal for both players in a two-
person zero-sum game.
Payoff Matrix: The tabular display of the payoffs of the players under various alternatives is
called the payoff matrix.
Pure strategy: A game solution that provides a single best strategy for each player.
Mixed strategy: A game solution in which the player randomly selects the strategy to play from
among several strategies with probabilities.
Dominated strategy: A strategy is dominated if another strategy is at least as good for every
strategy that the opposing player may employ.
3.10 Assignment
1 What is game theory? What do you understand by 'zero-sum' in the context of game
theory?
2 Explain the following: Saddle point, Pure strategy, Mixed strategy
3 Explain the concept of dominance with examples.
4 For the following Two-person, zero-sum game, find the optimal strategies for the two
players and value of the game:
B’s Strategy
B1 B2 B3
A’s Strategy A1 5 9 3
A2 6 -12 -11
A3 8 16 10
5 Solve the following game graphically
Player B’s Strategy
B1 B2
A1 3 4
Player A’s Strategy A2 -3 12
A3 6 -2
A4 -4 -9
A5 5 -3
3.11 Activities
Discuss applications of game theory with examples
3.12 Case Study
Two television stations in a market compete with each other for viewing audience. Local
programming options for the 5.00 pm weekday time slot include a sitcom rerun, an early news
program or a travel show. Assume that each station has the same three programming options and
must make its preseason program selection before knowing what the other television station will
do. The viewing audience changes in thousands of viewers for station A as follows:
Station B
Sitcom, b1 News, b2 Travel, b3
Sitcom, a1 70 80 50
Station A News, a2 90 60 95
Travel, a3 105 90 65
Determine the optimal programming strategy for each station. What is the value of the game?
Block Summary
In this block, we discussed some operations research techniques in detail. In the first unit,
business situations pertaining to managing projects were discussed. The project management
techniques involve constructing a network diagram using the rules of networking. The project
networking technique with multiple estimates of activity time was also explained. In the second
unit, the waiting line concept was introduced along with its applications. The unit included
some of the most commonly used queuing models. In the last unit, game theory and its
applications were discussed. The consequences of the interplay between a player's strategies
and those of a competitor were explained, and the methods used to derive the optimal strategy
were covered.
Block Assignment
5. For the following Two-person, zero-sum game, find the optimal strategies for the two
players and value of the game:
B’s Strategy
B1 B2 B3
A’s Strategy A1 30 40 -80
A2 0 15 -20
A3 90 20 50
6. Customers at a local bakery arrive randomly following a Poisson process. The single
salesman can serve customers at an average rate of 20 customers per hour, the service
time being exponentially distributed. The mean arrival rate of customers is 12 per hour.
Determine the following:
a. The mean number of customers in the bakery
b. The mean time spent by a customer in the bakery
c. The expected number of customers waiting in the queue
d. The mean waiting time of a typical customer in the queue
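Question 6 relies on the standard single-server (M/M/1) queue formulas covered in the queuing unit. They can be coded exactly with rational arithmetic (a minimal sketch; `mm1_measures` is a hypothetical helper name):

```python
from fractions import Fraction

def mm1_measures(lam, mu):
    """Standard M/M/1 measures for arrival rate lam and service rate mu.
    Returns (L, W, Lq, Wq); time measures are in the same units as 1/rate."""
    assert lam < mu, "queue is unstable unless arrival rate < service rate"
    L = Fraction(lam, mu - lam)                  # mean number in the system
    W = Fraction(1, mu - lam)                    # mean time in the system
    Lq = Fraction(lam * lam, mu * (mu - lam))    # mean number in the queue
    Wq = Fraction(lam, mu * (mu - lam))          # mean waiting time in the queue
    return L, W, Lq, Wq

# Rates from question 6: 12 arrivals/hour, 20 services/hour.
L, W, Lq, Wq = mm1_measures(12, 20)
print(L, W, Lq, Wq)  # 3/2 1/8 9/10 3/40 (times in hours)
```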