Readings
Albright, S and Winston, W.L, Business Analytics Data Analysis & Decision Making,
(Cengage Learning, 2017) 6th edition [ISBN 9781305947542] Chapter 4.
Probability essentials
A probability is a number between 0 and 1 which measures the likelihood that some event
will occur. An event with probability 0 cannot occur, whereas an event with probability 1 is
certain to occur.
An event with probability greater than 0 and less than 1 involves uncertainty, and the closer
its probability is to 1, the more likely it is to occur.
The simplest probability rule involves the complement of an event. If A is any event, then the
complement of A, denoted by 𝐴𝑐 (or in some books by 𝐴̅), is the event that A does not occur. If
the probability of A is P(A), then the probability of its complement is given by the equation:
𝑃(𝐴𝑐 ) = 1 − 𝑃(𝐴).
We say that events are mutually exclusive if at most one of them can occur. That is, if one of
them occurs, then none of the others can occur. Events are exhaustive if they exhaust all
possibilities - one of the events must occur. The addition rule of probability gives the
probability that at least one of a set of mutually exclusive events will occur:
𝑃(at least one of 𝐴1 through 𝐴𝑛 ) = 𝑃(𝐴1 ) + 𝑃(𝐴2 ) + ⋯ + 𝑃(𝐴𝑛 ).
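The complement and addition rules can be illustrated with a small sketch. The fair six-sided die and the helper names complement and at_least_one are assumptions made for illustration only; they do not come from the reading.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
# The six faces are mutually exclusive and exhaustive events.
p_face = {face: Fraction(1, 6) for face in range(1, 7)}

def complement(p):
    """Complement rule: P(A^c) = 1 - P(A)."""
    return 1 - p

def at_least_one(probs):
    """Addition rule; valid only when the events are mutually exclusive."""
    return sum(probs)

p_five_or_six = at_least_one([p_face[5], p_face[6]])  # two exclusive faces
p_not_six = complement(p_face[6])                      # complement of "six"
p_any_face = at_least_one(p_face.values())             # exhaustive, so this is 1

print(p_five_or_six, p_not_six, p_any_face)
```

Exact fractions are used so that the sums are not blurred by floating-point round-off.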
Probabilities are always assessed relative to the information currently available. As new
information becomes available, probabilities often change. A formal way to revise
probabilities on the basis of new information is to use conditional probabilities.
Let A and B be any events with probabilities P(A) and P(B). Typically, the probability P(A) is
assessed without knowledge of whether B occurs. However, if you are told that B has
occurred, then the probability of A might change. The new probability of A is called
the conditional probability of A given B.
Conditional probability:
𝑃(𝐴 | 𝐵) = 𝑃(𝐴 ∩ 𝐵) / 𝑃(𝐵).
Multiplication rule:
𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴 | 𝐵) 𝑃(𝐵).
Probabilistic independence means that knowledge of one event is of no value when
assessing the probability of the other. The main advantage to knowing that two events are
independent is that in that case the multiplication rule simplifies to:
𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴) 𝑃(𝐵).
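As a rough illustration of conditional probability and independence, the sketch below enumerates a hypothetical sample space of two fair die rolls. The events and the helper names prob and cond_prob are assumptions chosen for the example, not part of the reading.

```python
from fractions import Fraction

# Hypothetical sample space: two rolls of a fair die, 36 equally likely outcomes.
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

def prob(event):
    """Probability of an event given as a true/false function of an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B); assumes P(B) > 0."""
    return prob(lambda o: a(o) and b(o)) / prob(b)

def first_is_six(o):
    return o[0] == 6

def second_is_six(o):
    return o[1] == 6

def total_at_least_10(o):
    return o[0] + o[1] >= 10

# The two rolls are independent: the joint probability equals the product.
p_joint = prob(lambda o: first_is_six(o) and second_is_six(o))
independent = p_joint == prob(first_is_six) * prob(second_is_six)

# Learning that the total is at least 10 changes the probability of a first six.
p_six_given_high = cond_prob(first_is_six, total_at_least_10)

print(independent, prob(first_is_six), p_six_given_high)
```

Here the product check works because the sample space is fully known; with real business data, independence would have to be judged empirically.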
How can you tell whether events are probabilistically independent? Unfortunately, this issue
usually cannot be settled with mathematical arguments; typically, you need empirical data to
decide whether independence is reasonable.
In many situations, outcomes are equally likely (for example, flipping (fair) coins, throwing
(fair) dice, etc.). Many probabilities, particularly in games of chance, can be calculated by
using an equally likely argument. Probabilities calculated in this way satisfy all of the rules of
probability, including the rules we have already discussed. However, many probabilities,
especially those in business situations, cannot be calculated by equally likely
arguments, simply because the possible outcomes are not equally likely.
The variance (or standard deviation) measures the variability in a distribution. The variance,
denoted by 𝜎² or Var(X), is a weighted sum of the squared deviations of the possible values
from the mean, where the weights are again the probabilities. The variance is expressed in the
square of the units of X, such as dollars squared.
Therefore, a more natural measure of variability is the standard deviation, denoted by 𝜎
or sd(X). It is the square root of the variance.
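Variance of a probability distribution:
𝜎² = Var(𝑋) = Σ (𝑥𝑖 − E(𝑋))² 𝑝(𝑥𝑖 ),
where the sum is taken over all possible values 𝑥𝑖 of X and 𝑝(𝑥𝑖 ) is the probability of 𝑥𝑖.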
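A minimal sketch of the mean, variance and standard deviation of a discrete distribution follows. The four return values and their probabilities are hypothetical numbers chosen so that the arithmetic is easy to check by hand.

```python
import math

def mean(values, probs):
    """Probability-weighted sum of the possible values."""
    return sum(x * p for x, p in zip(values, probs))

def variance(values, probs):
    """Probability-weighted sum of squared deviations from the mean."""
    mu = mean(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

def std_dev(values, probs):
    """Square root of the variance, in the same units as the values."""
    return math.sqrt(variance(values, probs))

# Hypothetical distribution of a percentage return.
returns = [-10.0, 0.0, 10.0, 20.0]
probs = [0.1, 0.3, 0.4, 0.2]

print(mean(returns, probs), variance(returns, probs), std_dev(returns, probs))
```

With these assumed numbers the mean is 7, the variance is 81 (in squared percentage points) and the standard deviation is 9 (in percentage points), which shows why the standard deviation is the easier quantity to interpret.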
An alternative method for specifying the probability distribution of two random variables X
and Y is as follows.
1. You first identify the possible values of X and the possible values of Y. Let x and y be
any two such values.
2. Next, you directly assess the joint probability of the pair (x, y) and denote it by 𝑃(𝑋 =
𝑥 ∩ 𝑌 = 𝑦) or p(x, y).
3. This is the probability of the joint event that X = x and Y = y both occur. As always,
the joint probabilities must be non-negative and sum to 1.
The joint distribution indicates not only how X and Y are related, but also how each of X and Y is distributed. In
probability terms, the joint distribution of X and Y determines the marginal distributions of
both X and Y, where each marginal distribution is the probability distribution of a single
random variable.
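One way to see how a joint distribution determines the marginals is to sum the joint probabilities over the other variable. The sketch below does this for a hypothetical 2-by-2 joint table; the numbers and the function name marginals are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical joint distribution p(x, y); the probabilities sum to 1.
joint = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

def marginals(joint):
    """Return the marginal distributions of X and Y from a joint table."""
    p_x, p_y = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        p_x[x] += p  # sum over all y for this x
        p_y[y] += p  # sum over all x for this y
    return dict(p_x), dict(p_y)

p_x, p_y = marginals(joint)
print(p_x, p_y)
```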
In the joint probability approach, a whole table of joint probabilities must be assessed, which
can be demanding. An alternative is to proceed backward. Instead of specifying the joint probabilities and then
deriving the marginal and conditional distributions, you can specify either set of marginal
probabilities and either set of conditional probabilities, and then use these to calculate the
joint probabilities.
Joint probability formula
𝑃(𝑋 = 𝑥 ∩ 𝑌 = 𝑦) = 𝑃(𝑋 = 𝑥 | 𝑌 = 𝑦) 𝑃(𝑌 = 𝑦).
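The backward approach can be sketched as follows: start from a marginal distribution of Y and a conditional distribution of X for each value of Y, then multiply. The labels "low" and "high", the probabilities, and the function name joint_from_conditional are all hypothetical.

```python
# Hypothetical marginal distribution P(Y = y).
p_y = {"low": 0.4, "high": 0.6}

# Hypothetical conditional distributions P(X = x | Y = y), one per value of y.
p_x_given_y = {
    "low": {0: 0.5, 1: 0.5},
    "high": {0: 0.25, 1: 0.75},
}

def joint_from_conditional(p_y, p_x_given_y):
    """P(X = x and Y = y) = P(X = x | Y = y) * P(Y = y) for every pair (x, y)."""
    return {
        (x, y): p_cond * p_y[y]
        for y, cond in p_x_given_y.items()
        for x, p_cond in cond.items()
    }

joint = joint_from_conditional(p_y, p_x_given_y)
total = sum(joint.values())  # a valid joint distribution sums to 1
print(joint, total)
```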
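Alternative formula:
𝑃(𝑋 = 𝑥 ∩ 𝑌 = 𝑦) = 𝑃(𝑌 = 𝑦 | 𝑋 = 𝑥) 𝑃(𝑋 = 𝑥).
A weighted sum of random variables 𝑋1 through 𝑋𝑛, with constant weights 𝑎1 through 𝑎𝑛, has the form:
𝑌 = 𝑎1 𝑋1 + 𝑎2 𝑋2 + ⋯ + 𝑎𝑛 𝑋𝑛.
The mean of Y is the corresponding weighted sum of the means:
E(𝑌) = 𝑎1 E(𝑋1) + 𝑎2 E(𝑋2) + ⋯ + 𝑎𝑛 E(𝑋𝑛).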
The variance is not as straightforward. Its value depends on whether the Xs are independent
or dependent. If they are independent, then Var(Y) is a weighted sum of the variances of
the Xs, using the squares of the a's as weights:
Var(𝑌) = 𝑎1² Var(𝑋1) + 𝑎2² Var(𝑋2) + ⋯ + 𝑎𝑛² Var(𝑋𝑛).
If the Xs are not independent, the variance of Y is more complex and requires covariances. In
particular, for every pair 𝑋𝑖 and 𝑋𝑗 with i < j, there is an extra term:
2 𝑎𝑖 𝑎𝑗 Cov(𝑋𝑖, 𝑋𝑗).
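The mean and variance of a weighted sum can be sketched as below. The weights and the two covariance tables are hypothetical, and the function names are chosen for illustration. The covariance table holds Var(Xi) on the diagonal and Cov(Xi, Xj) off the diagonal, so the independent case is simply a table with zeros off the diagonal.

```python
def weighted_sum_mean(weights, means):
    """E(Y) = a1*E(X1) + ... + an*E(Xn)."""
    return sum(a * m for a, m in zip(weights, means))

def weighted_sum_variance(weights, cov):
    """Var(Y) for Y = a1*X1 + ... + an*Xn.

    cov[i][i] is Var(Xi) and cov[i][j] is Cov(Xi, Xj).
    """
    n = len(weights)
    # Squared weights times variances.
    var = sum(weights[i] ** 2 * cov[i][i] for i in range(n))
    # One extra term 2*ai*aj*Cov(Xi, Xj) for every pair i < j.
    var += sum(
        2 * weights[i] * weights[j] * cov[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    )
    return var

weights = [2.0, 3.0]
cov_independent = [[4.0, 0.0], [0.0, 1.0]]   # hypothetical, zero covariance
cov_dependent = [[4.0, -1.0], [-1.0, 1.0]]   # hypothetical, negative covariance

print(weighted_sum_variance(weights, cov_independent))
print(weighted_sum_variance(weights, cov_dependent))
```

In this assumed example the negative covariance lowers the variance of the weighted sum from 25 to 13.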
For the sum of independent random variables, we assume the Xs are independent and the
weights are all 1, that is, 𝑌 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑛 . Therefore, the mean of the sum is the sum
of the means, and the variance of the sum is the sum of the variances:
E(𝑌) = E(𝑋1) + E(𝑋2) + ⋯ + E(𝑋𝑛) and Var(𝑌) = Var(𝑋1) + Var(𝑋2) + ⋯ + Var(𝑋𝑛).
For the difference between two independent random variables, we assume 𝑋1 and 𝑋2 are
independent and the weights are 𝑎1 = 1 and 𝑎2 = −1, so that Y can be written as 𝑌 = 𝑋1 −
𝑋2. The mean of the difference is the difference between means, but the variance of the
difference is the sum of the variances:
E(𝑌) = E(𝑋1) − E(𝑋2) and Var(𝑌) = Var(𝑋1) + Var(𝑋2).
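For the difference between two dependent random variables, the setup is the same as in the
previous case except that the Xs are no longer independent. The mean of the difference is
still the difference between the means, but the variance now includes a covariance term:
Var(𝑌) = Var(𝑋1) + Var(𝑋2) − 2 Cov(𝑋1, 𝑋2).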
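For dependent variables, the variance of a difference is Var(X1) + Var(X2) − 2 Cov(X1, X2). The sketch below checks this against a direct calculation on a hypothetical joint distribution of two positively related variables; the numbers and the helper expect are assumptions.

```python
# Hypothetical joint distribution of (X1, X2) with a positive relationship.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expect(f):
    """Expected value of f(X1, X2) under the joint distribution."""
    return sum(f(x1, x2) * p for (x1, x2), p in joint.items())

m1 = expect(lambda a, b: a)
m2 = expect(lambda a, b: b)
var1 = expect(lambda a, b: (a - m1) ** 2)
var2 = expect(lambda a, b: (b - m2) ** 2)
cov = expect(lambda a, b: (a - m1) * (b - m2))

# Formula: Var(X1 - X2) = Var(X1) + Var(X2) - 2*Cov(X1, X2).
var_diff_formula = var1 + var2 - 2 * cov

# Direct calculation from the distribution of the difference itself.
mean_diff = expect(lambda a, b: a - b)
var_diff_direct = expect(lambda a, b: (a - b - mean_diff) ** 2)

print(var_diff_formula, var_diff_direct)
```

Because the assumed covariance is positive, the variance of the difference (0.2) is smaller than the sum of the two variances (0.5).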
For linear functions of a random variable, suppose that Y can be written as 𝑌 = 𝑎 + 𝑏𝑋 for
some constants a and b. In this special case the random variable Y is called a linear function
of the random variable X. The mean, variance and standard deviation of Y can be calculated
from the similar quantities for X with the following formulae:
E(𝑌) = 𝑎 + 𝑏E(𝑋), Var(𝑌) = 𝑏² Var(𝑋) and sd(𝑌) = |𝑏| sd(𝑋).
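The linear-function formulae can be applied directly, as in the short sketch below. The constants a = 100 and b = −2 and the summary measures of X are hypothetical; the function name linear_function_stats is an assumption.

```python
import math

def linear_function_stats(a, b, mean_x, var_x):
    """Mean, variance and standard deviation of Y = a + b*X."""
    mean_y = a + b * mean_x
    var_y = b ** 2 * var_x             # the constant a does not affect variability
    sd_y = abs(b) * math.sqrt(var_x)   # |b| keeps the standard deviation non-negative
    return mean_y, var_y, sd_y

# Hypothetical X with mean 7 and variance 81, and Y = 100 - 2X.
mean_y, var_y, sd_y = linear_function_stats(100.0, -2.0, 7.0, 81.0)
print(mean_y, var_y, sd_y)
```

The negative b is chosen deliberately to show that the sign of b affects the mean but not the variance or standard deviation.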
Market return
An investor is concerned with the market return for the coming year, where the market return
is defined as the percentage gain (or loss, if negative) over the year. The data
file Market_return.xlsx shows how to compute the mean, variance, and standard deviation of
the probability distribution of the market return for the coming year.
Note that the probabilities sum up to 1, as they should. Uncertainty is often a key factor, and
you cannot simply ignore it!
Portfolio investments
The data file GM_vs_Gold.xlsx considers the return of General Motors (GM) stock and gold
under different economic conditions. We can use this file to obtain the relevant joint
distribution and use it to calculate the covariance and correlation between returns on the two
given investments, and also to analyse a portfolio containing these two investments.
The scenario approach applies because a given state of the economy defines both GM and
gold returns.
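The scenario approach can be sketched in a few lines: each scenario carries a probability and a return for each investment, and the covariance and correlation are probability-weighted sums over the scenarios. The four scenarios below are hypothetical and are not the values in GM_vs_Gold.xlsx.

```python
import math

# Hypothetical scenarios: (probability, GM return, gold return).
scenarios = [
    (0.25, -0.20, 0.20),
    (0.25, 0.00, 0.00),
    (0.25, 0.20, 0.00),
    (0.25, 0.40, -0.20),
]

def expected(values, probs):
    """Probability-weighted sum of the values."""
    return sum(v * p for v, p in zip(values, probs))

probs = [s[0] for s in scenarios]
gm = [s[1] for s in scenarios]
gold = [s[2] for s in scenarios]

mean_gm, mean_gold = expected(gm, probs), expected(gold, probs)

# Covariance: expected product of deviations from the two means.
cov = expected([(g - mean_gm) * (h - mean_gold) for g, h in zip(gm, gold)], probs)

sd_gm = math.sqrt(expected([(g - mean_gm) ** 2 for g in gm], probs))
sd_gold = math.sqrt(expected([(h - mean_gold) ** 2 for h in gold], probs))

# Correlation: covariance rescaled to lie between -1 and 1.
corr = cov / (sd_gm * sd_gold)
print(cov, corr)
```

In this assumed data the two returns move in opposite directions, so both the covariance and the correlation are negative.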
This file also includes an introduction to simulation.
Simulation is an extremely useful tool which can be used to incorporate uncertainty explicitly
into spreadsheet models. A simulation model is the same as a regular spreadsheet model
except that some cells include random quantities. Each time the spreadsheet recalculates, new
values of the random quantities occur, and these typically lead to different bottom-line
results.
The key to simulating random variables is Excel’s RAND function, which generates a
random number between 0 and 1. It has no arguments: =RAND().
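Outside Excel, the same idea can be sketched in Python, where random.random() plays a role similar to =RAND() by returning a number in the interval [0, 1). The sketch below turns such a number into a draw from a discrete distribution using cumulative probabilities. The distribution, the seed, and the function name simulate_discrete are assumptions for illustration.

```python
import random

def simulate_discrete(values, probs, u):
    """Map a uniform number u in [0, 1) to one of the values.

    The value x is returned when u falls below the cumulative probability up to x.
    """
    cumulative = 0.0
    for x, p in zip(values, probs):
        cumulative += p
        if u < cumulative:
            return x
    return values[-1]  # guard against floating-point round-off in the last step

# Hypothetical distribution of a percentage return (mean 7).
values = [-10.0, 0.0, 10.0, 20.0]
probs = [0.1, 0.3, 0.4, 0.2]

rng = random.Random(42)  # fixed seed so that the run is repeatable
draws = [simulate_discrete(values, probs, rng.random()) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)
```

Each draw corresponds to one recalculation of a spreadsheet cell; averaging many draws should give a sample mean close to the theoretical mean of the assumed distribution.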
Substitute products
A company sells two products which are substitutes for each other. The data
file Substitute_products.xlsx shows the company’s assumed joint probability distribution of
demand for the two products during the coming month.
We can use the given joint probability distribution of demands to find the conditional
distribution of demand for each product, given the demand for the other product, and to
calculate the covariance and correlation between demands for these substitutes.
Let 𝐷1 and 𝐷2 denote the demands for products 1 and 2, respectively. You first find the
marginal distributions of 𝐷1 and 𝐷2 . The marginal distributions indicate that ‘in-between’
values of 𝐷1 or of 𝐷2 are most likely, whereas extreme values in either direction are less
likely.
The joint probabilities spell out this relationship, but they are rather difficult to interpret. A
better way is to calculate the conditional distributions of 𝐷1 given 𝐷2 , or of 𝐷2 given 𝐷1 . The
relationships between the demands are shown graphically.
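A conditional distribution is obtained by dividing one row or column of the joint table by the matching marginal probability. The sketch below uses a hypothetical joint distribution of demands, not the one in Substitute_products.xlsx, and an illustrative function name.

```python
# Hypothetical joint distribution p(d1, d2) of demands for two substitutes.
joint = {
    (100, 100): 0.1, (100, 200): 0.3,
    (200, 100): 0.4, (200, 200): 0.2,
}

def conditional_d1_given_d2(joint, d2):
    """P(D1 = d1 | D2 = d2) = p(d1, d2) / P(D2 = d2)."""
    p_d2 = sum(p for (_, y), p in joint.items() if y == d2)  # marginal of D2
    return {x: p / p_d2 for (x, y), p in joint.items() if y == d2}

given_low = conditional_d1_given_d2(joint, 100)
given_high = conditional_d1_given_d2(joint, 200)
print(given_low, given_high)
```

With these assumed numbers, high demand for product 1 is more likely when demand for product 2 is low, which is the pattern expected of substitutes.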
Portfolio analysis
Consider the data file Portfolio_analysis.xlsx. This can be used to determine the mean annual
return of the portfolio, and to quantify the risk associated with the total dollar return from the
given weighted sum of annual stock returns.
This is a typical weighted sum model. The random variables are annual returns from stocks;
the weights are the dollar amounts invested in stocks.
Probability    Return of Gamma    Return of Delta
X=x 0 1 2 3
a. Find the probability that one customer is in the regular checkout line.
b. Find the probability that no more than one customer is in line.
c. Find the probability that at least two people are in line.
d. Find the probability that three or fewer customers are in line.
e. What is the probability that no one is waiting or being served in the regular checkout
line?
f. What is the probability that three customers are waiting in line?
g. On average, how many customers would you expect to see in line?
a. Find the expected demand (in units) for the upcoming quarter.
b. What is the probability that the demand for this product will be above its mean in the
upcoming quarter?
c. What is the probability that the demand for this product will be below its mean in the
upcoming quarter?
d. What is the probability that the demand for this product exceeds 2500 units in the
upcoming quarter?
e. What is the probability that the demand for this product will be less than 3500 units in
the upcoming quarter?
X=x 0 1 2 3