
15.060 Data, Models, Decisions

Final Review

December 15 2007

DMD Fall 07 Final Review

Robert Freund, David Gamarnik, and Andreas Schulz, course materials for 15.060 Data, Models, and Decisions, Fall 2007.
MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Final Exam

Date: Monday, December 17
Time: 9am-12pm
Place: See MIT Server (come early!)
Closed book exam
No laptops or communication devices
You can bring a calculator
Formula Sheet will be provided
BUT get a good night's sleep!

Table of Contents

Topic 1: Decision Analysis
Topic 2: Discrete Random Variables
Topic 3: Covariance and Correlation
Topic 4: Continuous Random Variables
Topic 5: Statistical Sampling
Topic 6: Simulation
Topic 7: Regression
Topic 8: Linear Optimization
Topic 9: Nonlinear Optimization
Topic 10: Discrete Optimization


TOPIC 1: Decision Analysis

You are NOT responsible for: Conditional Probabilities


TOPIC 2: Discrete Random Variables


Discrete Random Variables

A probability distribution for a discrete random variable X consists of
(i) possible values $x_1, x_2, \dots, x_n$,
(ii) corresponding probabilities $p_1, p_2, \dots, p_n$,
so that: $P(X = x_1) = p_1,\ P(X = x_2) = p_2,\ \dots,\ P(X = x_n) = p_n$.

A histogram is a display of probabilities as a bar chart.
[Figure: histogram of P(Y = y), vertical axis from 0.00 to 0.50]

Probabilities are non-negative and must sum to 1.
The possible values are mutually exclusive and collectively exhaustive (they describe all the possibilities that can happen).

3 important measures

1. Expected Value or Mean (measured in units of X):
Average outcome; a measure of central tendency.
$E(X) = \mu_X = \sum_i P(X = x_i)\, x_i = \sum_i p_i x_i$

2. Variance (in units of X squared):
Squared deviation around the mean; a measure of spread.
$\mathrm{Var}(X) = \sigma_X^2 = \sum_i P(X = x_i)\,(x_i - \mu_X)^2 = \sum_i p_i (x_i - \mu_X)^2$

3. Standard Deviation (in units of X):
Measure of spread.
$\sigma_X = \sqrt{\mathrm{Var}(X)}$
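As a quick check of the three measures, the sketch below computes the mean, variance, and standard deviation of a small discrete distribution. The values and probabilities are made up for illustration and do not come from the course materials.

```python
from math import sqrt

def expected_value(values, probs):
    """E(X) = sum over i of p_i * x_i."""
    return sum(p * x for x, p in zip(values, probs))

def variance(values, probs):
    """Var(X) = sum over i of p_i * (x_i - mu_X)^2."""
    mu = expected_value(values, probs)
    return sum(p * (x - mu) ** 2 for x, p in zip(values, probs))

def standard_deviation(values, probs):
    """sigma_X = sqrt(Var(X))."""
    return sqrt(variance(values, probs))

# Hypothetical distribution (an assumption, not from the slides).
values = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.3, 0.4]

mu = expected_value(values, probs)
var = variance(values, probs)
sd = standard_deviation(values, probs)
print(mu, var, sd)
```

Note that the mean and standard deviation come out in the units of X, while the variance is in squared units.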

You are NOT responsible for:

The Binomial distribution


TOPIC 3: Covariance and Correlation


Covariance:

$\mathrm{COV}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \sum_{i,j} P(X = x_i;\ Y = y_j)\,(x_i - \mu_X)(y_j - \mu_Y)$

Measures the extent to which two random variables vary together.

Correlation:

$\mathrm{CORR}(X, Y) = \dfrac{\mathrm{COV}(X, Y)}{\sigma_X \sigma_Y}$

Correlation is unitless.
CORR(X, Y) is always between -1 and 1.


Working with joint distributions

Suppose X and Y are two random variables with joint distribution $P(X = x_i;\ Y = y_k)$:

$E[X] = \sum_i x_i\, P(X = x_i)$

$\mathrm{Var}(X) = \sigma_X^2 = \sum_i (x_i - \mu_X)^2\, P(X = x_i)$

$P(X = x_i) = \sum_k P(X = x_i;\ Y = y_k)$

The left-hand side is the marginal distribution of X; the right-hand side sums the joint distribution of X and Y over all values of Y.
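The sketch below works through a joint distribution end to end: it derives the marginals, then the covariance and correlation. The joint table is a made-up example, and the helper names are illustrative only.

```python
from math import sqrt

# Hypothetical joint distribution P(X = x; Y = y) (an assumption, not from the slides).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def marginal(joint, axis):
    """Sum joint probabilities over the other variable (axis 0 = X, axis 1 = Y)."""
    dist = {}
    for outcome, p in joint.items():
        dist[outcome[axis]] = dist.get(outcome[axis], 0.0) + p
    return dist

def mean(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum(p * (v - mu) ** 2 for v, p in dist.items())

def covariance(joint):
    """COV(X, Y) = sum over i, j of P(x_i; y_j) (x_i - mu_X)(y_j - mu_Y)."""
    mu_x = mean(marginal(joint, 0))
    mu_y = mean(marginal(joint, 1))
    return sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in joint.items())

def correlation(joint):
    """CORR(X, Y) = COV(X, Y) / (sigma_X * sigma_Y)."""
    sd_x = sqrt(variance(marginal(joint, 0)))
    sd_y = sqrt(variance(marginal(joint, 1)))
    return covariance(joint) / (sd_x * sd_y)

p_x = marginal(joint, 0)
cov_xy = covariance(joint)
corr_xy = correlation(joint)
print(p_x, cov_xy, corr_xy)
```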

Sums of random variables

Mean of a sum:

$E(aX + bY + c) = aE(X) + bE(Y) + c$

Variance of a sum:

$\mathrm{Var}(aX + bY + c) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{COV}(X, Y)$
$= a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2ab\,\sigma_X \sigma_Y\,\mathrm{CORR}(X, Y)$
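To see how the correlation term drives the variance of a sum, here is a small sketch of the formula. The standard deviations 2 and 4 match Problem 1(d) of the 2005 final reviewed at the end of this deck; the function name and the choice of correlation values are assumptions for illustration.

```python
from math import sqrt

def var_of_linear_combination(a, b, sd_x, sd_y, corr):
    """Var(aX + bY + c); the constant c does not affect the variance."""
    return a**2 * sd_x**2 + b**2 * sd_y**2 + 2 * a * b * sd_x * sd_y * corr

sd_x, sd_y = 2.0, 4.0

# Standard deviation of Z = X + Y for three correlation values.
sd_perfect_positive = sqrt(var_of_linear_combination(1, 1, sd_x, sd_y, 1.0))
sd_uncorrelated = sqrt(var_of_linear_combination(1, 1, sd_x, sd_y, 0.0))
sd_perfect_negative = sqrt(var_of_linear_combination(1, 1, sd_x, sd_y, -1.0))
print(sd_perfect_positive, sd_uncorrelated, sd_perfect_negative)
```

The largest standard deviation occurs at CORR(X, Y) = 1, where it equals the sum of the two standard deviations.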


TOPIC 4: Continuous random variables


Continuous random variables

A continuous r.v. can take any value in some interval.
Example: W = Time spent waiting in line at Au Bon Pain!
There are an infinite number of possible values that the random variable can assume.
For a continuous random variable, questions are phrased in terms of a range of values.
NOTE:
You would never say: Probability to wait exactly 10.5 minutes! P(W = 10.5) = 0
But: Probability to wait:
Less than 10 minutes: P(W < 10);
More than 20 minutes: P(W > 20);
Between 10 and 15 minutes: P(10 < W < 15).


Density functions

Probability density function:
Denoted f(t): gives a picture of the distribution (think of a smoothed histogram).
Area under the curve between 2 values a and b: $P(a \le X \le b)$
Total area under the curve = 1 (total probability)
[Figure: density curve f(t), horizontal axis from 0 to 4.5]

Cumulative distribution function:
$F(t) = P(X \le t)$
$P(X > t) = 1 - F(t)$
$P(a \le X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a)$
[Figure: cumulative curve F(t), horizontal axis from 0 to 4.5]

The normal distribution

Bell-shaped curve
[Figure: normal density curve]

Computing probabilities with the Normal distribution:
You want: $P(a \le X \le b)$ where X is $N(\mu, \sigma)$

1. Define $Z = \dfrac{X - \mu}{\sigma}$: Z is N(0,1)

$P(a \le X \le b) = P\left(\dfrac{a - \mu}{\sigma} \le Z \le \dfrac{b - \mu}{\sigma}\right) = P\left(Z \le \dfrac{b - \mu}{\sigma}\right) - P\left(Z \le \dfrac{a - \mu}{\sigma}\right)$

2. Use the standard normal probability table (Z table)
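The two-step recipe (standardize, then look up the Z table) can be sketched in code. Here the table lookup is replaced by `statistics.NormalDist` from the Python standard library, which is a substitution for illustration; the exam itself uses the printed table. The mean 10 and standard deviation 2 are made-up numbers.

```python
from statistics import NormalDist

def normal_interval_probability(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma), computed by standardizing to Z."""
    z = NormalDist()              # standard normal: mean 0, standard deviation 1
    z_low = (a - mu) / sigma      # step 1: standardize both endpoints
    z_high = (b - mu) / sigma
    return z.cdf(z_high) - z.cdf(z_low)   # step 2: the "Z table" lookup

# Hypothetical example: X ~ N(10, 2).
p_one_sd = normal_interval_probability(8, 12, mu=10, sigma=2)
p_two_sd = normal_interval_probability(6, 14, mu=10, sigma=2)
print(round(p_one_sd, 4), round(p_two_sd, 4))
```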

[Figure: normal density. 68.3% of the probability lies within one standard deviation of the mean ($\mu \pm \sigma$) and 95.4% within two ($\mu \pm 2\sigma$). Cumulative probabilities at $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$ are .0228, .1587, .5, .8413, .9772.]

Sum of i.i.d. random variables: Central Limit Theorem

$X_1, X_2, \dots, X_n$ independent identically distributed random variables: $E[X_i] = \mu$, $\mathrm{Var}(X_i) = \sigma^2$

For n > 30, $S_n = X_1 + X_2 + \dots + X_n$ is approximately normal with mean $n\mu$ and variance $n\sigma^2$ (standard deviation $\sigma\sqrt{n}$).

For n > 30, $M_n = \dfrac{X_1 + X_2 + \dots + X_n}{n}$ is approximately normal with mean $\mu$ and variance $\sigma^2/n$ (standard deviation $\sigma/\sqrt{n}$).

The probability distribution of $X_i$ does not matter;
n does not have to be very large (30 is good enough);
CLT requires only 2 pieces of information: the mean and SD of $X_i$.
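A small simulation can make the Central Limit Theorem concrete. The sketch below sums n = 36 uniform(0, 1) draws many times and compares the simulated mean and standard deviation of the sum with the CLT values. The choice of the uniform distribution, n, the number of trials, and the seed are all assumptions for illustration.

```python
import random
from math import sqrt
from statistics import fmean, stdev

rng = random.Random(0)     # fixed seed so the run is repeatable
n = 36                     # number of i.i.d. terms in each sum (n > 30)
trials = 20000             # number of simulated sums

mu = 0.5                   # mean of a uniform(0, 1) variable
sigma = sqrt(1 / 12)       # standard deviation of a uniform(0, 1) variable

sums = [sum(rng.random() for _ in range(n)) for _ in range(trials)]

clt_mean = n * mu              # CLT: mean of S_n
clt_sd = sigma * sqrt(n)       # CLT: standard deviation of S_n
sim_mean = fmean(sums)
sim_sd = stdev(sums)

# Share of simulated sums within one standard deviation of the mean;
# for a normal distribution this is about 68.3%.
share_within_1sd = sum(abs(s - clt_mean) <= clt_sd for s in sums) / trials
print(sim_mean, sim_sd, share_within_1sd)
```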

TOPIC 5: Statistical Sampling


Sample mean of a population

Estimator of the mean of a population ($\mu$): sample mean $\bar{X}$

$\bar{X} = \dfrac{X_1 + \dots + X_n}{n}$

where $X_1, \dots, X_n$ are n r.v.s following the population distribution (unknown mean $\mu$, unknown std dev $\sigma$).

[Figure: population of size N; random sample 1 with sample mean $\bar{x}_1$; random sample 2 with sample mean $\bar{x}_2$]

$\bar{X}$ is a random variable!
By the Central Limit Theorem, if n > 30, then $\bar{X}$ is approximately normal, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Sample standard deviation

Estimator of the standard deviation of a population ($\sigma$): the sample standard deviation S:

$S = \sqrt{\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

[Figure: population of size N; random sample 1 with sample std dev $s_1$; random sample 2 with sample std dev $s_2$]

$S^2$ is an unbiased estimator of the variance, i.e. $E[S^2] = \sigma^2$.

S is a random variable!

Confidence interval for sample mean

How confident are we that $\bar{X}$ is a good estimate of the true mean of the population?
The realized values of $\bar{X}$ and S in the sample of size n are $\bar{x}$ and s.
If n > 30, then a $\beta$% confidence interval for the mean is:

$\left[\bar{x} - \dfrac{c\,s}{\sqrt{n}},\ \bar{x} + \dfrac{c\,s}{\sqrt{n}}\right]$

What sample size do we need to be sure that the $\beta$% confidence interval is within +/- L of the true mean?
The required sample size is:

$n = \dfrac{c^2 s^2}{L^2}$

c is such that $P(-c < Z < c) = \beta$%, where Z ~ N(0,1):
$\beta$ = 90: c = 1.645; $\beta$ = 95: c = 1.960; $\beta$ = 99: c = 2.576.
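The interval and sample-size formulas above can be sketched as follows. The critical value c is obtained from `statistics.NormalDist.inv_cdf` rather than a printed table, and the sample numbers (mean 50, standard deviation 10, n = 100) are made up for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

def critical_value(beta_percent):
    """c such that P(-c < Z < c) = beta% for Z ~ N(0, 1)."""
    return NormalDist().inv_cdf(0.5 + beta_percent / 200)

def mean_confidence_interval(x_bar, s, n, beta_percent):
    """[x_bar - c*s/sqrt(n), x_bar + c*s/sqrt(n)], intended for n > 30."""
    half_width = critical_value(beta_percent) * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

def required_sample_size(s, L, beta_percent):
    """Smallest integer n with n >= c^2 s^2 / L^2."""
    c = critical_value(beta_percent)
    return ceil(c**2 * s**2 / L**2)

# Hypothetical sample: x_bar = 50, s = 10, n = 100.
low, high = mean_confidence_interval(50, 10, 100, 95)
n_needed = required_sample_size(s=10, L=1, beta_percent=95)
print(round(low, 2), round(high, 2), n_needed)
```

Rounding the sample size up rather than to the nearest integer is a design choice here, so that the half-width is guaranteed not to exceed L.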

3 types of CI problems

There are 3 main types of confidence interval problems you should know how to do:
1. Given $\bar{x}$, s, n, $\beta$% -> find c -> find Confidence Interval [ , ]
2. Given $\bar{x}$, s, n, L (or the interval itself [ , ]) -> find c -> find the $\beta$% confidence level
3. Design Problem: given $\beta$%, s, L -> find c -> find the required sample size n


Confidence interval for proportion

Let X = number of observations in a sample of size n that have a certain characteristic, p = the actual proportion of the population that has that characteristic.

The sample proportion $\hat{p} = \dfrac{X}{n}$ is approximately normally distributed with mean p and standard deviation $\sqrt{\dfrac{p(1 - p)}{n}}$.

A $\beta$% confidence interval for p is:

$\left[\hat{p} - c\sqrt{\dfrac{p(1 - p)}{n}},\ \hat{p} + c\sqrt{\dfrac{p(1 - p)}{n}}\right]$

where c is that number for which $P(-c < Z < c) = \beta$%, Z ~ N(0,1).

Note: p is unknown:
Option 1: replace it by its estimate $\hat{p}$
Option 2: p = 1/2 (worst case), because $p(1 - p) \le 1/4$ for all p
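The two options for handling the unknown p can be compared side by side. In this sketch the counts (120 out of 400) are hypothetical, and the `worst_case` flag is an illustrative parameter name rather than anything from the course.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, beta_percent, worst_case=False):
    """Interval p_hat +/- c * sqrt(p(1-p)/n).

    worst_case=False uses Option 1 (plug in p_hat for p);
    worst_case=True uses Option 2 (p = 1/2, the widest possible interval).
    """
    p_hat = x / n
    p_for_sd = 0.5 if worst_case else p_hat
    c = NormalDist().inv_cdf(0.5 + beta_percent / 200)
    half_width = c * sqrt(p_for_sd * (1 - p_for_sd) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical sample: 120 of 400 observations have the characteristic.
plug_in = proportion_confidence_interval(120, 400, 95)
worst = proportion_confidence_interval(120, 400, 95, worst_case=True)
print(plug_in, worst)
```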

TOPIC 6: Simulation


Some lessons on simulation

1. Provides more info than average case analysis and simple formulas.
2. You generate random variables that obey a variety of discrete and continuous probability distributions (e.g. uniform, binomial, etc.).
3. The results are not precise, due to the inherent randomness in a simulation.
We typically obtain estimates of the distributions of particular quantities of interest, means and standard deviations of these distributions.
From these distributions, one can derive confidence intervals and other inferences of statistical sampling.
4. The question of how many trials or runs of a simulation can become a complex statistical issue. Fortunately, with today's computing power, this is not a paramount issue for most problems.
5. In practice, one should recognize that gaining managerial confidence in a simulation model will depend on at least three factors:
(i) a good understanding of the underlying management problem,
(ii) one's ability to use the concepts of probability and statistics correctly,
(iii) one's ability to communicate these concepts effectively.

So what happens on the final exam? You may get Sampling questions!
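To illustrate lesson 3, the sketch below runs a toy simulation and reports the estimated mean together with a 95% confidence interval built from the sampling formulas. The profit model (uniform integer demand between 50 and 150, stock of 100, price 5, unit cost 2) is entirely hypothetical and only serves as something to simulate.

```python
import random
from math import sqrt
from statistics import fmean, stdev

def simulate_profit(rng, stock=100, price=5.0, unit_cost=2.0):
    """One run: sell min(demand, stock) units, pay for every stocked unit."""
    demand = rng.randint(50, 150)   # hypothetical demand distribution
    return price * min(demand, stock) - unit_cost * stock

rng = random.Random(42)   # fixed seed so the run is repeatable
runs = 10000
profits = [simulate_profit(rng) for _ in range(runs)]

avg_profit = fmean(profits)
s = stdev(profits)
half_width = 1.960 * s / sqrt(runs)   # c = 1.960 for a 95% interval
confidence_interval = (avg_profit - half_width, avg_profit + half_width)
print(avg_profit, confidence_interval)
```

The estimate changes with the seed, which is the imprecision the slide refers to; more runs shrink the interval at the rate $1/\sqrt{\text{runs}}$.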

TOPIC 7: Regression


Multiple regression

Explanatory variables: $X_1, X_2, \dots, X_k$ taking values $x_{1i}, x_{2i}, \dots, x_{ki}$ (i = 1, . . . , n)
Dependent variable: Y taking values $y_i$ (i = 1, . . . , n)

Model: $Y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + \varepsilon_i$

$\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n$ are iid random variables, $N(0, \sigma)$

Goal: Choose $b_0, b_1, \dots, b_k$ to minimize the residual sum of squares

$\hat{y}_i = b_0 + b_1 x_{1i} + \dots + b_k x_{ki}$, $\quad e_i = y_i - \hat{y}_i$

Minimize $\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

Regression Output

1) Regression coefficients: $b_0, b_1, \dots, b_k$, sample estimates of $\beta_0, \beta_1, \dots, \beta_k$

2) Standard error: estimate of $\sigma$; a measure of the amount of noise in the model

3) Standard errors of the coefficients: $s_{b_0}, s_{b_1}, \dots, s_{b_k}$
same role as the estimate of the standard deviation of the sample mean in sampling
Prior to observing $b_m$ and $s_{b_m}$, $\dfrac{b_m - \beta_m}{s_{b_m}}$ has a t-dist. with (n - k - 1) d.o.f.

Degrees of freedom: n - (k + 1) = n - k - 1
n pieces of data; used up (k + 1) degrees of freedom to estimate $b_0, b_1, \dots, b_k$

used to test the existence of a linear relationship between Y and $x_m$:
+ What is the 95% confidence interval for $\beta_m$?
+ Does the interval contain 0?

4) Significance test: Is $\beta_m$ significantly different from zero?

The $\beta$% confidence interval for $\beta_m$ is:

$(b_m - c\,s_{b_m},\ b_m + c\,s_{b_m})$,

where c is such that $P(-c < T < c) = \beta/100$.

Steps to finding the Confidence Interval:
1) d.o.f. = n - k - 1
2) using $\beta$% and d.o.f., find c on the t-table
3) using c, $b_m$, $s_{b_m}$, write the interval above.

If zero does not lie in the confidence interval, we are confident at the $\beta$% level that $\beta_m$ is different from 0.
If zero lies in the confidence interval, then $\beta_m$ is not significantly different from zero: we should be skeptical that Y depends linearly on $x_m$, and we might want to eliminate $x_m$ from the model.

6) Coefficient of determination:

$R^2 = 1 - \dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$ ($\bar{y}$ is the sample mean of the $y_i$'s.)

$= 1 - \dfrac{\text{Variation not accounted for by x variables}}{\text{Total variation}}$

$= \dfrac{\text{Variation that is accounted for by x variables}}{\text{Total variation}}$

$R^2$ takes values between 0 and 1:

[Figure: two scatter plots, one for each of the cases below]

$R^2 = 1$: x values completely account for Y values
$R^2 = 0$: x values account for none of the variation in the Y values

A good value of $R^2$ depends on the situation
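As a worked instance of least squares and $R^2$, the sketch below fits a regression with a single explanatory variable (k = 1) using the closed-form formulas, then computes $R^2$ from the residuals. The restriction to one variable and the data points are simplifying assumptions; the course's regression output would normally come from a software package.

```python
def fit_simple_regression(xs, ys):
    """Least-squares b0, b1 for y_hat = b0 + b1 * x (one explanatory variable)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def r_squared(xs, ys, b0, b1):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = sum(ys) / len(ys)
    residual_ss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    total_ss = sum((y - y_bar) ** 2 for y in ys)
    return 1 - residual_ss / total_ss

# Hypothetical data (an assumption, not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_regression(xs, ys)
r2 = r_squared(xs, ys, b0, b1)
print(b0, b1, r2)
```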

Checklist for evaluating linear regression models

Linearity: If there is only one explanatory variable, construct a scatter-plot of the data to check for linearity. Otherwise, use common sense to decide if a linear relationship is reasonable. (Rule of thumb for choosing the number of factors: n > 5(k + 2).)
Signs of Regression Coefficients: Check to see that the signs make intuitive sense.
Significance tests: Check if the regression coefficients are significantly different from zero.
R2: Check if the value of R2 is reasonably high.
Normality: Check that the residuals are approximately Normally distributed by constructing a histogram of residuals.
Heteroscedasticity: Do error terms have constant standard deviation? Plot the residuals with the observed values of each of the explanatory variables.
Autocorrelation: Are error terms independent? If data are time-dependent, plot the residuals over time to check for any apparent patterns.
Multicollinearity: Are two explanatory variables correlated?
Signs: regression coefficients have the wrong sign, or we find a high R2 but one or more of the regression coefficients is not significantly different from 0.
Look at the correlation matrix. Large positive or negative correlations between the explanatory variables are bad.

Heteroscedasticity
[Figure: two residual plots against Advertising Expenditures, titled "No Evidence of Heteroscedasticity" and "Evidence of Heteroscedasticity"]

Autocorrelation
[Figure: two residual plots against observation index i, titled "No evidence of Autocorrelation" and "Evidence of Autocorrelation"]

Important regression issues you should know

Know how to interpret the regression output. Explain in English what the coefficients mean and give intuition about how they affect the dependent variable.
Know how to build the confidence intervals for the coefficients using the t-table.
Know how to read and interpret the regression graph and the output residual graphs (histogram, autocorrelation, heteroscedasticity).
Know how to improve your model:
Check the signs, significance, correlation, etc. to decide which variables to add and drop (explaining why).
Check linearity: if it fails, can you modify your data to make a better model? Example: make a polynomial.
Dummy variables: you need to know how to model categorical data. Example: beer bottles red x green.


TOPIC 8: Linear Optimization


Optimization terminology

Decision Variable: Describes a decision that needs to be made, e.g. how many items to produce.
Objective Function: An expression (in terms of the variables) that needs to be minimized or maximized.
Constraint: An expression that restricts the values of the variables.

Steps in formulation

1. Define the decision variables.
2. Write the objective as a function of these vars. Determine whether max or min.
3. Write the constraints as functions of these vars. Either $\le$, $\ge$, or $=$.
4. Determine the variable restrictions, e.g. non-negative, integer.

Be careful of units!


A Fundamental Point

[Figure: feasible region of a linear optimization problem in the (x, y) plane, axes from 0 to 40]

If an optimal solution exists, there is always a corner point optimal solution!
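The corner-point property suggests a brute-force way to solve a tiny two-variable linear program: intersect every pair of constraint lines, keep the feasible intersections (the corner points), and pick the best one. The sketch below does this for a made-up problem (maximize 3x + 2y subject to x + y <= 4, x + 3y <= 6, x >= 0, y >= 0). It is only an illustration of the idea, not how LP solvers work in practice.

```python
from itertools import combinations

# Each constraint is (a, b, rhs), meaning a*x + b*y <= rhs. Hypothetical data.
constraints = [
    (1, 1, 4),     # x + y <= 4
    (1, 3, 6),     # x + 3y <= 6
    (-1, 0, 0),    # x >= 0
    (0, -1, 0),    # y >= 0
]
objective = (3, 2)  # maximize 3x + 2y

def intersection(c1, c2):
    """Point where both constraints hold with equality (Cramer's rule), or None."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None  # parallel lines never meet in a single point
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)

def is_feasible(point, tol=1e-9):
    x, y = point
    return all(a * x + b * y <= rhs + tol for a, b, rhs in constraints)

def objective_value(point):
    return objective[0] * point[0] + objective[1] * point[1]

corners = []
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and is_feasible(point):
        corners.append(point)

best_corner = max(corners, key=objective_value)
best_value = objective_value(best_corner)
print(corners, best_corner, best_value)
```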

About Shadow Prices

Associated with each constraint is a shadow price (= 0 for non-binding constraints).
The shadow price is the change in the objective value per unit change in the right hand side, given all other data remain the same.
Associated with each shadow price is a range over which this shadow price holds.
If r.h.s. changes within range: current solution remains optimal, shadow price tells us rate of change in the optimal objective function value;
If r.h.s. changes outside range: current solution is not optimal anymore; we need to solve the optimization problem again!


Avoid frequent mistakes!

Forgetting the non-negativity restrictions
Confusing Maximizing with Minimizing
Inconsistent and/or incorrect units
Reversing the signs of the constraints
Wrong interpretation of the shadow prices
Change in R.H.S. outside the allowable range

TOPIC 9: Nonlinear Optimization


Some possible cases

[Figure: several panels, each showing a feasible region, objective function level curves, and the optimal solution(s). Panel captions: linear objective, nonlinear constraints; nonlinear objective, linear constraints (corner solution); nonlinear objective, nonlinear constraints; nonlinear objective, linear constraints. One panel shows multiple optimal solutions.]

Local vs global solutions

Optimal solution: A feasible solution that optimizes the objective value among all feasible points.
Local optimal solution: A feasible solution that optimizes the objective value among all feasible points near it.

Example: Minimization in one variable over 2 <= x <= 7
Computer software for NLP can efficiently find local opt. BUT! this solution will not necessarily be the global opt.

[Figure: graph of f(x) for 2 <= x <= 7]
x = 2 is a local optimal solution.
x = 3.5 is a local optimal solution.
x = 5 is the global optimal solution.
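The gap between local and global optima can be seen with a simple local search started from two different points. The function below is a made-up one-variable example on 2 <= x <= 7 (it is not the f(x) plotted on the slide), and the grid-based descent is a deliberately naive stand-in for what NLP software does.

```python
def f(x):
    """Hypothetical objective with two valleys (an assumption for illustration)."""
    return (x - 3) ** 2 * (x - 6) ** 2 - 2 * x

LOW, HIGH, STEP = 2.0, 7.0, 0.01
N = round((HIGH - LOW) / STEP)   # index of the last grid point

def grid_x(i):
    return LOW + i * STEP

def local_descent(start_index):
    """Walk to the neighboring grid point with the lower f until none improves."""
    i = start_index
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j <= N]
        best = min(neighbors, key=lambda j: f(grid_x(j)))
        if f(grid_x(best)) >= f(grid_x(i)):
            return grid_x(i)     # no neighbor improves: a local minimum
        i = best

from_left = local_descent(0)     # start at x = 2
from_right = local_descent(N)    # start at x = 7
best_found = min((from_left, from_right), key=f)
print(from_left, f(from_left), from_right, f(from_right))
```

Starting from x = 2 the search stops in the shallower valley, while starting from x = 7 it reaches the deeper one, so the answer depends on the starting point. Trying several starting points and keeping the best result is one common workaround, though it still gives no guarantee of finding the global optimum.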

Shadow prices in NLP

Review: Shadow price of a constraint for LP:
Incremental change in the optimal objective function value per unit increase in the right-hand-side (RHS) of the constraint.

Shadow price of a constraint for NLP (Lagrangian multiplier):
Approximate incremental change in optimal objective function value with small change in the RHS.

Binding constraint: when satisfied as equality at the optimum.
For nonbinding constraints, shadow prices are zero!

TOPIC 10: Discrete Optimization


Discrete optimization

Feasible region is a set of discrete points.
Can't be assured a corner point or even boundary solution.
Not as easy to solve as LP.
Solving it as an LP provides a relaxation and a bound on the solution.

[Figure: feasible region shown as a set of discrete points]

Modeling issues

Decision variables are restricted to take only integer values.
Great modeling flexibility using binary variables:
xi = 1, if event i occurs
xi = 0, otherwise

Strategic planning (number of people to hire)
Allocation of resources (which project to fund)
Determination of productivity and distribution

More on modeling issues

If x1 = 0 then x2 = 0: $x_2 \le x_1$
If x1 = 1 then x2 = 1: $x_2 \ge x_1$
If x1 = 1 then x2 = 1 and vice versa: $x_2 = x_1$
If x1 = 1 then x2 = 1 or x3 = 1: $x_1 \le x_2 + x_3$
Invest in at most 2 projects: $\sum_{i=1}^{10} x_i \le 2$
Select 5 out of 10 projects: $\sum_{i=1}^{10} x_i = 5$

Key concept: Analyze logical implication of constraint in all possible cases
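Checking a constraint "in all possible cases" is easy to automate for binary variables, since there are only finitely many 0/1 assignments. The sketch below enumerates every assignment of (x1, x2, x3) and confirms that each linear constraint is satisfied exactly when the logical statement is true. The dictionary layout and function names are illustrative choices.

```python
from itertools import product

# Each entry maps a statement to (logical condition, linear constraint).
rules = {
    "if x1 = 0 then x2 = 0": (
        lambda x1, x2, x3: x1 != 0 or x2 == 0,
        lambda x1, x2, x3: x2 <= x1,
    ),
    "if x1 = 1 then x2 = 1": (
        lambda x1, x2, x3: x1 != 1 or x2 == 1,
        lambda x1, x2, x3: x2 >= x1,
    ),
    "if x1 = 1 then x2 = 1 or x3 = 1": (
        lambda x1, x2, x3: x1 != 1 or x2 == 1 or x3 == 1,
        lambda x1, x2, x3: x1 <= x2 + x3,
    ),
}

def constraint_matches_logic(logic, constraint):
    """True if the constraint holds in exactly the cases where the logic holds."""
    return all(
        bool(logic(*case)) == bool(constraint(*case))
        for case in product((0, 1), repeat=3)
    )

results = {
    name: constraint_matches_logic(logic, constraint)
    for name, (logic, constraint) in rules.items()
}
print(results)
```

The same check catches a reversed inequality: pairing "if x1 = 1 then x2 = 1" with x2 <= x1 fails on the case x1 = 0, x2 = 1.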

Partial taxonomy of optimization

Linear Optimization: objective and constraints are linear expressions
Nonlinear Optimization: objective and/or constraints are non-linear expressions
Integer Optimization: variables are restricted to discrete (integer) values
Mixed-Integer Optimization: some variables are continuous, some are discrete

Problems from 2005 final

Problem 1: True or False

(a) If the 95% confidence interval for the sample mean extends from 4 to 14 based on a random sample of size 60, then the sample mean was 9.

TRUE. The interval is centered around the sample mean:
$\bar{x} - L = 4$, $\bar{x} + L = 14$
Midpoint: $\bar{x} = (4 + 14)/2 = 9$

(b) If $R^2 = 0$, it means that all the data points in a y-vs-x regression model must fall along the horizontal line.

FALSE. $R^2 = 0$ means the fitted line is horizontal; the data points can still scatter above and below it.

Problem 1: True or False

(c) A resident of Boston is chosen at random. Consider the 2 events:
I. The person selected is a lawyer;
II. The person selected is a lawyer and an environmental activist.
The probability of event II can never exceed that of event I.

TRUE
[Venn diagram: "lawyers and environmental activists" is the overlap of the sets "lawyers" and "environmental activists", so it is contained in "lawyers".]

Problem 1: True or False

(d) If X has mean 1, standard deviation 2 and Y has mean 1, standard deviation 4, then the standard deviation of Z = X + Y cannot exceed 6.

TRUE
$\mathrm{Var}(Z) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\sigma_X\,\sigma_Y\,\mathrm{CORR}(X,Y)$
Maximum when CORR(X,Y) = 1: $\mathrm{Var}(Z) = 4 + 16 + 16 = 36$, so $\sigma_Z = 6$

(e) Mendel asks a random number generator to create 10,000 independent selections from a N(0,1) distribution. The 10,000 selections turn out to have a sample mean of 0.08. Assuming the random generator to work properly, the chance would be less than 1% that the sample mean would fall at least as far as it did from the true mean.

TRUE
$n = 10{,}000$, $\bar{x} = 0.08$. By the CLT, $\bar{X} \sim N(0, 1/n)$ (approximately), where $1/n$ is the variance, so the standard deviation of $\bar{X}$ is $1/100$.
$P(\bar{X} \ge 0.08) = P(Z \ge (0.08 - 0)/(1/100)) = P(Z \ge 8) \approx 0$ (8 standard deviations from the mean)
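As a rough numerical check of (d) and (e), the sketch below recomputes both answers with the standard library only. The helper name `upper_tail` is an assumption made for this illustration.

```python
import math


def upper_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# (d) The standard deviation of Z = X + Y is largest when CORR(X, Y) = 1.
sd_x, sd_y = 2.0, 4.0
max_sd_sum = math.sqrt(sd_x**2 + sd_y**2 + 2 * sd_x * sd_y * 1.0)

# (e) Sample mean of n draws from N(0, 1) has standard deviation 1/sqrt(n).
n = 10_000
sd_mean = 1 / math.sqrt(n)
z = (0.08 - 0) / sd_mean
tail = upper_tail(z)

print(f"max sd of X+Y: {max_sd_sum}")
print(f"z-score of the observed sample mean: {z:.2f}")
print(f"P(Z >= z): {tail:.3e}")
```

Even doubled for a two-sided reading of "at least as far", the tail probability is far below 1%.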


Problem 2 (a)

John has not been feeling well recently and he believes he has a bacterial infection with probability 0.6.
He takes a test that is 99% reliable:
The probability that the test is positive given that he has an infection is 99%;
The probability that the test is negative given that he does not have an infection is 99%.
If the test result is positive, what is the probability that he has an infection?

P(INF) = 0.6, P(!INF) = 0.4
P(test+ | INF) = 0.99, P(test- | !INF) = 0.99

We want: P(INF | test+)
P(INF | test+) = P(INF and test+) / P(test+)
P(test+ | INF) = P(INF and test+) / P(INF)

Problem 2 (a)

P(INF) = 0.6, P(!INF) = 0.4
P(test+ | INF) = 0.99, P(test- | !INF) = 0.99
We want: P(INF | test+) = P(INF and test+) / P(test+)

P(test+ | INF) = P(INF and test+) / P(INF), so P(INF and test+) = 0.99 * 0.6 = 0.594
P(test+ | !INF) = P(!INF and test+) / P(!INF), so P(!INF and test+) = 0.01 * 0.4 = 0.004

P(test+) = P(INF and test+) + P(!INF and test+) = 0.594 + 0.004 = 0.598

P(INF | test+) = 0.594 / 0.598 = 0.993
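The same Bayes' rule arithmetic as a small function. This is a sketch only; the function and parameter names (`posterior`, `sensitivity`, `specificity`) are chosen here for illustration and do not come from the slides.

```python
def posterior(prior, sensitivity, specificity):
    """P(INF | test+) from P(INF), P(test+ | INF) and P(test- | !INF)."""
    joint_pos_inf = sensitivity * prior                 # P(INF and test+)
    joint_pos_no = (1 - specificity) * (1 - prior)      # P(!INF and test+)
    return joint_pos_inf / (joint_pos_inf + joint_pos_no)


p_inf_given_pos = posterior(prior=0.6, sensitivity=0.99, specificity=0.99)
print(f"P(INF | test+) = {p_inf_given_pos:.4f}")
```

Trying a much smaller prior (say 0.01) with the same test shows how strongly the answer depends on John's initial belief.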



Problem 2 (b)

Statistics show that the number of years a CEO spends in office is normally distributed with mean 5.5 and standard deviation 1.2. Given that a CEO has been in office for exactly 5 years so far, what is the probability that she will still be in office 2 years from now?

X: # years in office, $X \sim N(5.5, 1.2)$
[Timeline of office tenure: start at t = 0, now at t = 5, two years from now at t = 7]

We want: $P(X \ge 7 \mid X \ge 5)$
$P(X \ge 7 \mid X \ge 5) = \frac{P(X \ge 7 \text{ and } X \ge 5)}{P(X \ge 5)} = \frac{P(X \ge 7)}{P(X \ge 5)} = \frac{1 - P(X \le 7)}{1 - P(X \le 5)}$

With $Z \sim N(0,1)$ (look up in Z-table!):
$P(X \le 7) = P(Z \le (7 - 5.5)/1.2) = P(Z \le 1.25) = 0.8944$
$P(X \le 5) = P(Z \le -0.417) = 0.3372$

$P(X \ge 7 \mid X \ge 5) = (1 - 0.8944)/(1 - 0.3372) = 0.1593$
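A minimal sketch of the same conditional probability without a Z-table, using the error function from the standard library (the helper `normal_cdf` is defined here for illustration). Because it does not round z to two decimals, the result differs from the table-based 0.1593 in the third decimal.

```python
import math


def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd), with sd the standard deviation."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


mean, sd = 5.5, 1.2
p_at_least_7 = 1 - normal_cdf(7, mean, sd)
p_at_least_5 = 1 - normal_cdf(5, mean, sd)

# X >= 7 implies X >= 5, so the joint event is just X >= 7.
p_stay = p_at_least_7 / p_at_least_5
print(f"P(X >= 7 | X >= 5) = {p_stay:.4f}")
```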

Problem 3

In a random poll of 100 randomly-selected business leaders, 77% say that they support Bernanke as new chairman of the Fed.

(a) What is the 99% confidence interval for the percentage of all business leaders who support Bernanke?

$n = 100$, $\hat{p} = 77\%$
99% confidence interval for the sample proportion:
$\left[\hat{p} - c\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\ ;\ \hat{p} + c\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]$
where $c$ is that number for which $P(-c < Z < c) = 99\%$, $Z \sim N(0,1)$, i.e. $c = 2.576$
99% confidence interval: [0.66; 0.88]
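The interval above can be reproduced in a few lines. This is a sketch under the slide's own assumptions (normal approximation, c = 2.576); the function name `proportion_ci` is illustrative.

```python
import math


def proportion_ci(p_hat, n, c):
    """Normal-approximation confidence interval for a proportion."""
    half_width = c * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


low, high = proportion_ci(p_hat=0.77, n=100, c=2.576)
print(f"99% CI: [{low:.2f}; {high:.2f}]")
```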

Problem 3

(b) Ezekiel, who has not seen the results of the poll, wants to find a 95% confidence interval for the percentage of business leaders who support Bernanke. He also wants the interval to extend no more than one percentage point in each direction around its midpoint. Make a sensible estimate of the number of business leaders he should poll.

$n = \frac{c^2}{4L^2}$
where $L = 1\%$, and $c$ is that number for which $P(-c < Z < c) = 95\%$, $Z \sim N(0,1)$, i.e. $c = 1.960$

n = 9,604 (round up non-integer values!)
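A short sketch of the sample-size rule. The factor 4 in the denominator comes from the conservative bound p(1 - p) <= 1/4, which is the natural choice when no poll result is available. The function name is hypothetical, and the rounding step is only there to keep floating-point noise from pushing an exact integer up by one.

```python
import math


def worst_case_sample_size(c, half_width):
    """Smallest n with c * sqrt(0.25 / n) <= half_width (uses p(1-p) <= 1/4)."""
    raw = c**2 / (4 * half_width**2)
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(raw, 6))


n_required = worst_case_sample_size(c=1.960, half_width=0.01)
print(f"poll at least {n_required} business leaders")
```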


Problem 4

Mendel performs a linear regression analysis on the unemployment rate in Massachusetts (UM) versus the current wholesale price of fuel oil per gallon (P) in Massachusetts in inflation-adjusted dollars. Using monthly data for a recent six-year period (i.e., using 72 observations), he reaches the least squares equation:
UM = 2.10 + 3.00P (P is in dollars and UM in per cent.)
The R^2 value for the regression is 0.66, and the upper end for the 95% confidence interval for the slope of P is 5.00. The sample standard deviation of the monthly Massachusetts unemployment rates over the six-year period studied was 1.00 percent.

Problem 4 (a)-(c)

(a) If fuel oil is projected to cost $1.30 in a forthcoming month, what is the estimate of the Massachusetts unemployment rate for that month based on the regression result?
UM = 2.1 + 3 * (1.3) = 6%

(b) Does the 95% confidence interval for the slope of P include 0?
NO: the CI is symmetric around the estimated slope 3 and its upper bound is 5, so the interval is [1, 5].

(c) What is the sum of squared residuals of the 72 data points around the regression line?
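The worked answer to (c) is not present in this text. One standard way to obtain it, sketched below, uses R^2 = 1 - SSR/SST together with SST = (n - 1) s^2. This assumes the quoted sample standard deviation uses the usual n - 1 divisor; the variable names are illustrative.

```python
n = 72             # monthly observations
r_squared = 0.66   # R^2 of the regression
sd_um = 1.00       # sample standard deviation of UM, in percent

# Total sum of squares around the mean, assuming an (n - 1) divisor in sd_um.
sst = (n - 1) * sd_um**2

# R^2 = 1 - SSR / SST, so the residual sum of squares is:
ssr = (1 - r_squared) * sst
print(f"SST = {sst:.2f}, sum of squared residuals = {ssr:.2f}")
```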


Problem 4 (d)

(d) Consider one at a time the following possible patterns among the residuals for this regression analysis. Briefly explain for each pattern whether, by itself, it would substantially reduce your confidence in the regression analysis:

I. The heavy majority of the residuals in the first three years studied were positive, while the heavy majority of those in the second three years were negative.
Autocorrelation: the residuals are not random but follow a time-based pattern.
(Another acceptable answer would be that the relationship might be nonlinear.)

II. The residuals are consistently larger in the months when the fuel prices are high than in those in which prices are low.
Heteroscedasticity: the residuals consistently get larger with larger values of the independent variable P.

Problem 4 (contd)

Fearful of an omitted variable in the regression above, Mendel performs another linear regression on the same data. For each month, the dependent variable is still UM, while the variables on the right are P and UN, the average unemployment rate in the other 49 American states. He reaches the revised regression equation:
UM = 1.50 + 2.00P + 0.50UN
R^2 for the revised regression was .75, while the upper ends of the 95% confidence intervals are 6.00 for the slope of P and 1.10 for the slope of UN.

Problem 4 (e)-(f)

(e) Do the regression results provide statistically convincing evidence that UN really belongs in the regression model? Briefly discuss.
NO: both 95% CIs contain 0: P [-2, 6] and UN [-0.1, 1.1]

(f) Suppose that UN and P exhibited strong positive correlation over the six years studied. What general problem in regression analysis might result from that circumstance? How might that problem have affected the regression results?
Multi-collinearity: the independent variables are highly correlated among themselves. This may negatively affect the statistical significance of both variables (like in this case).
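The intervals quoted in (b) and (e) follow from the fact that a regression confidence interval is symmetric around the point estimate. A minimal sketch, with helper names invented for this illustration:

```python
def lower_bound(estimate, upper):
    """Lower end of a CI that is symmetric around the point estimate."""
    return 2 * estimate - upper


def contains_zero(estimate, upper):
    return lower_bound(estimate, upper) <= 0 <= upper


# (slope estimate, upper end of the 95% CI) as given in the problem.
slopes = {
    "P (first regression)": (3.00, 5.00),
    "P (revised regression)": (2.00, 6.00),
    "UN (revised regression)": (0.50, 1.10),
}

for name, (est, upper) in slopes.items():
    low = lower_bound(est, upper)
    print(f"{name}: [{low:.1f}, {upper:.1f}] contains 0: {contains_zero(est, upper)}")
```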


Problem 5

Recall the Filatoi Riuniti case and linear optimization model, where the firm would like to determine its monthly outsourcing strategy for spun yarn among six other spinning mills as well as their own internal production strategy for spun yarn.
The objective function is to minimize the variable cost (including transportation cost) for meeting demand for the four spun yarn sizes (Extrafine, Fine, Medium, and Coarse).
There are four types of constraints in the model:
1. Filatoi must meet monthly demand for each of the four spun yarn sizes.
2. None of the seven mills can exceed their monthly production capacity.
3. Neither Ambrosi nor De Blasi can produce Extrafine yarn.
4. All decision variables must be nonnegative.
Suppose that demand for spun yarns is the same as in the original case, as are the production capacities and machine hour requirements.

Problem 5

Suppose, however, that over time the variable production and transportation costs have changed, and that the current data for Filatoi Riuniti's production problem for the coming month of January are shown in Table 1 below.

Problem 5

Roberto Cominetti has re-run the linear optimization model using this new data, resulting in the optimal solution shown in Table 2 along with the Sensitivity Report shown in Table 3. Please answer the following questions based on the linear optimization model solution and Sensitivity Report.


Problem 5 (a)-(b)

(a) What are binding constraints in the model? In the optimal plan for the coming month, which spinning mills would use all of their spinning capacity to produce spun yarn for Filatoi Riuniti?
All the constraints are binding, except for Capacity at Giuliani. This is the only mill whose capacity is not fully used under the optimal strategy.

(b) What would be the cost impact of increasing the required production of Extrafine yarn from 25,000 kg to 27,000 kg? What can you say, if anything, about the cost impact of increasing the required production of Extrafine yarn from 25,000 kg to 29,000 kg?
Shadow Price = 18.397 ($/kg). Max increment allowed: +3,197.5 kg
Additional Costs = +2,000 kg * $18.397/kg = $36,794 / month
Max Additional Costs = +3,197.5 kg * $18.397/kg = $58,824.4 (for additional 3,197.5 kg). Nothing can be said for the remaining 802.5 kg except that they would cost at least $18.397/kg.
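The reasoning in (b) can be sketched as a small helper: the shadow price gives the exact cost change only while the change in the right-hand side stays within the allowable range from the Sensitivity Report. The function name and the `None` return convention are assumptions made for this illustration.

```python
def cost_change(shadow_price, delta, allowable_increase):
    """Exact cost change for a right-hand-side increase, or None if out of range."""
    if delta > allowable_increase:
        return None  # the shadow price is no longer guaranteed to hold
    return shadow_price * delta


SHADOW_PRICE = 18.397        # $/kg, Extrafine demand constraint
ALLOWABLE_INCREASE = 3197.5  # kg, from the Sensitivity Report

to_27000 = cost_change(SHADOW_PRICE, 2000, ALLOWABLE_INCREASE)
to_29000 = cost_change(SHADOW_PRICE, 4000, ALLOWABLE_INCREASE)

# Beyond the allowable range only a lower bound on the extra cost is available.
lower_bound_29000 = SHADOW_PRICE * 4000

print(f"25,000 -> 27,000 kg: +${to_27000:,.0f}")
print(f"25,000 -> 29,000 kg: exact change unknown, at least ${lower_bound_29000:,.0f}")
```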

Problem 5 (c)-(d)

(c) Another local spinning mill by the name of Havarti has informed Filatoi that they can produce Fine spun yarn for Filatoi for a delivered cost of $14.25/kg. Should Filatoi consider entering into an agreement with Havarti to produce Fine spun yarn at this price?
NO: The shadow price for the demand of Fine is 14.018. Hence, if we were to produce less Fine yarn with the current machines and outsource it to Havarti, we would save $14.018 per kg, and the extra cost would be $14.25 per kg, so it is not worth it.

(d) According to the model's data, monthly capacity at De Blasi is 2,600 spinning machine hours. However, Filatoi Riuniti has just received an email from the outsourcing manager at De Blasi indicating that capacity for the coming month will be curtailed to 2,200 spinning machine hours due to some unanticipated machine maintenance. How much will this change the total variable cost of producing and/or outsourcing spun yarn in the coming month?
Shadow Price = -0.086 ($/hour)
Additional costs = (-400 hours) * (-$0.086/hour) = $34.4

Problem 5 (e)

(e) How much do you think Giuliani would have to reduce the price they charge Filatoi Riuniti for Fine spun yarn in order for Filatoi to want to discuss outsourcing production of Fine spun yarn to them?
The shadow price for Fine yarn is $14.02. Giuliani would have to reduce their price below this level.

Problem 6

Forest Capital (FC) has decided to appoint Sarah Edwards as the new portfolio manager of its portfolio of technology and utility stocks in emerging markets, which is currently comprised of various amounts in ten different companies. Table 4 below shows the current portfolio weights, the latest annualized expected return and standard deviation estimates, and the classifications of each of the ten companies.

Problem 6

The estimated correlations among the returns of the ten companies are shown in Table 5. Note in Table 5 that FC assumes for simplicity that returns among stocks are uncorrelated except among stocks A, B, and C.

Problem 6

Sarah has decided to use an optimization model to select the new weights of the portfolio for the coming month. She would like to maximize the expected return of the portfolio subject to the following constraints:
1. The standard deviation of the resulting portfolio should be at most 8%.
2. The amount of turnover of the portfolio should be at most 30%. As an example of how turnover is calculated, if prior to trading a portfolio has 70% of its funds in Stock 1 and 30% in Stock 2, and after the trade the weights are 60% for Stock 1 and 40% for Stock 2, the turnover of the portfolio is (|70-60| + |30-40|) = 20%.
3. Last month, the total portfolio weight in technology stocks was 0.08+0.07+0.17+0.09+0.08 = 0.49 = 49%. Sarah would like to maintain the character of the portfolio as a balanced portfolio between technology and utility stocks. For this reason, she would like the total weight of the portfolio in technology stocks to be between 45% and 55%.
4. All portfolio weights need to be nonnegative. That is, short positions are not allowed in the portfolio.
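The turnover definition in constraint 2 is simply a sum of absolute weight changes. A minimal sketch using the two-stock example from the text (the function name `turnover` is illustrative):

```python
def turnover(old_weights, new_weights):
    """Sum of absolute changes in portfolio weights."""
    return sum(abs(old - new) for old, new in zip(old_weights, new_weights))


# Two-stock example from the problem statement: 70/30 becomes 60/40.
example = turnover([0.70, 0.30], [0.60, 0.40])
print(f"turnover = {example:.0%}")
```

Because of the absolute values, a solver sees this constraint as non-smooth, which is one reason the model is posed as a nonlinear optimization problem.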

Problem 6 (a)

(a) Write down a formulation of a nonlinear optimization model to determine the new weights of the portfolio.

$w_i$ = fraction of the resulting portfolio invested in stock $i$

Objective: $\max\ 0.08 w_1 + 0.12 w_2 + 0.15 w_3 + 0.11 w_4 + \dots + 0.08 w_9 + 0.05 w_{10}$

Subject to:
$w_1 + w_2 + w_3 + w_4 + \dots + w_9 + w_{10} = 1$ (fractions)
$\left[ (0.13)^2 w_1^2 + (0.25)^2 w_2^2 + (0.35)^2 w_3^2 + \dots + (0.07)^2 w_{10}^2 + 2(0.13)(0.25)(0.4) w_1 w_2 + 2(0.13)(0.35)(-0.1) w_1 w_3 + 2(0.25)(0.35)(0.1) w_2 w_3 \right]^{1/2} \le 0.08$
$|w_1 - 0.12| + |w_2 - 0.08| + |w_3 - 0.07| + \dots + |w_{10} - 0.08| \le 0.3$
$w_2 + w_3 + w_5 + w_7 + w_8 \le 0.55$
$w_2 + w_3 + w_5 + w_7 + w_8 \ge 0.45$
$w_i \ge 0$ (for each $i$ from 1 to 10)

Problem 6 (b)

(b) Suppose that in order to trim the rather excessive transaction costs in emerging markets, Sarah would like to limit her portfolio to stocks in only six different companies. How would you augment your formulation of the model using binary variables to incorporate this requirement into the model?

Problem 6 (c)

(c) Suppose that Sarah would like to limit the number of trades to at most seven of the ten companies. (Note that if a stock's weight does not change, it does not produce a trade.) By defining binary variables, describe how you would augment your model to incorporate this additional requirement as well.

Good luck!!

There are things MBAs can't solve

For everything else, there is DMD!

Additional Practice Problems

TOPIC 2:

Discrete Random Variables

The Beer and Coke Example

Beer and Coke daily sales at a soccer stadium

Probability p_i    X: # of Beer Cans x_i    Y: # of Coke Cans y_i
0.15               35                       41
0.27               78                       10
0.15               81                       0
0.26               30                       13
0.17               16                       42

p_i = P(X = x_i and Y = y_i)

[Figure: Scatter Plot of Daily Sales of Beer and Coke; x-axis: Beer Sales (0 to 100), y-axis: Coke Sales (0 to 50)]

Some Questions

What is the expected number of beer cans sold? of coke cans sold?
What is the standard deviation of beer cans sold? of coke cans?
What is the covariance and the correlation of beer and coke cans sold?
What is the expected daily revenue?
What is the standard deviation of the daily revenue?

Some Questions

The expected number of beer cans sold is
$\mu_X = E(X) = \sum_i P(X = x_i)\, x_i$

The variance of beer cans sold is
$\sigma_X^2 = \mathrm{VAR}(X) = \sum_i P(X = x_i)\,(x_i - \mu_X)^2$

Here it turns out that $P(X = x_i) = p_i$, but usually:
$P(X = x_i) = \sum_j P(X = x_i,\ Y = y_j)$

The standard deviation of beer cans sold is
$\sigma_X = \sqrt{\mathrm{VAR}(X)}$

Summary of Daily Beer Sales

Prob. p_i    # Beer Cans x_i    p_i x_i    p_i (x_i - E(X))^2
0.15         35                 5.25       29.32
0.27         78                 21.06      227.38
0.15         81                 12.15      153.79
0.26         30                 7.80       93.66
0.17         16                 2.72       184.91

E(X) = 48.98    VAR(X) = 689.06    StdDev(X) = 26.25

Summary of Daily Coke Sales

Prob. p_i    # Coke Cans y_i    p_i y_i    p_i (y_i - E(Y))^2
0.15         41                 6.15       70.18
0.27         10                 2.70       23.71
0.15         0                  0.00       56.28
0.26         13                 3.38       10.55
0.17         42                 7.14       87.06

E(Y) = 19.37    VAR(Y) = 247.77    StdDev(Y) = 15.74

Some Questions

The covariance of beer and coke cans sold is
$\mathrm{COV}(X,Y) = \sum_i p_i\,(x_i - \mu_X)(y_i - \mu_Y)$

The correlation of beer and coke cans sold is
$\mathrm{CORR}(X,Y) = \mathrm{COV}(X,Y) / (\sigma_X \sigma_Y)$

[Figure: Scatter Plot of Daily Sales of Beer and Coke; x-axis: Beer Sales, y-axis: Coke Sales]

Summary of Daily Beer and Coke Sales

Prob. p_i    Number of Beer Cans x_i    Number of Coke Cans y_i    p_i (x_i - E(X)) (y_i - E(Y))
0.15         35                         41                         -45.36
0.27         78                         10                         -73.42
0.15         81                         0                          -93.03
0.26         30                         13                         31.43
0.17         16                         42                         -126.88

E(X) = 48.98    StdDev(X) = 26.25
E(Y) = 19.37    StdDev(Y) = 15.74
COV(X,Y) = -307.25
Correlation = -0.74
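The summary tables can be recomputed directly from the joint distribution. The sketch below uses only the five (p_i, x_i, y_i) rows from the example; the helper name `expectation` is illustrative.

```python
import math

# Joint distribution from the example: p_i = P(X = x_i and Y = y_i).
p = [0.15, 0.27, 0.15, 0.26, 0.17]
x = [35, 78, 81, 30, 16]   # beer cans
y = [41, 10, 0, 13, 42]    # coke cans


def expectation(values):
    """Probability-weighted average of the given outcomes."""
    return sum(pi * v for pi, v in zip(p, values))


mu_x, mu_y = expectation(x), expectation(y)
var_x = expectation([(xi - mu_x) ** 2 for xi in x])
var_y = expectation([(yi - mu_y) ** 2 for yi in y])
cov_xy = expectation([(xi - mu_x) * (yi - mu_y) for xi, yi in zip(x, y)])
corr_xy = cov_xy / math.sqrt(var_x * var_y)

print(f"E(X) = {mu_x:.2f}, VAR(X) = {var_x:.2f}, StdDev(X) = {math.sqrt(var_x):.2f}")
print(f"E(Y) = {mu_y:.2f}, VAR(Y) = {var_y:.2f}, StdDev(Y) = {math.sqrt(var_y):.2f}")
print(f"COV(X,Y) = {cov_xy:.2f}, CORR(X,Y) = {corr_xy:.2f}")
```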


Some More Questions About Beer and Coke

X = number of cans of beer sold; Y = number of cans of coke sold
Revenues: $3 per can of beer, $2 per can of coke
Daily revenue (in $) = 3X + 2Y

E(X) = 48.98, E(Y) = 19.37
StdDev(X) = 26.25, StdDev(Y) = 15.74
Cov(X,Y) = -307.25

What is the expected daily revenue?
E(3X + 2Y) = 3 E(X) + 2 E(Y)
= $3 * 48.98 + $2 * 19.37
= $185.68

What is the standard deviation of the daily revenue?
VAR(3X + 2Y) = 3^2 * VAR(X) + 2^2 * VAR(Y) + 2 * 3 * 2 * COV(X,Y)
= 9 * 689 + 4 * 248 + 12 * (-307)
= 3509
STD DEV(3X + 2Y) = sqrt(3509) = $59.23
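A short sketch of the same linear-combination rules, using the unrounded summary statistics quoted earlier (689.06, 247.77, -307.25). The slide rounds these to whole numbers before combining, so its variance of 3509 and standard deviation of $59.23 differ slightly from the values computed here.

```python
import math

mean_x, mean_y = 48.98, 19.37
var_x, var_y, cov_xy = 689.06, 247.77, -307.25
a, b = 3, 2   # dollars of revenue per can of beer and per can of coke

# E(aX + bY) = a E(X) + b E(Y)
expected_revenue = a * mean_x + b * mean_y

# VAR(aX + bY) = a^2 VAR(X) + b^2 VAR(Y) + 2ab COV(X, Y)
var_revenue = a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy
sd_revenue = math.sqrt(var_revenue)

print(f"expected revenue = ${expected_revenue:.2f}")
print(f"variance = {var_revenue:.2f}, standard deviation = ${sd_revenue:.2f}")
```

The negative covariance lowers the revenue variance compared with the uncorrelated case, since strong beer days tend to be weak coke days.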


TOPIC 4:

Continuous Random Variables

The amazon.com example

The time, in minutes, spent surfing amazon.com last month by people in this auditorium is normally distributed: $X \sim N(170, 10)$

What if we triple the time spent of a randomly chosen student? $Y = 3X$
$\mu_Y = 3\mu_X = 3(170) = 510$
$\sigma_Y^2 = \mathrm{Var}(3X) = 3^2 \sigma_X^2 = 9(100) = 900$, $\sigma_Y = 30$

Let's take 3 independent students at random and combine the time they spent on amazon.com last month: $Y = X_1 + X_2 + X_3$
$\mu_Y = 3\mu_X = 3(170) = 510$
$\sigma_Y^2 = \mathrm{Var}(X_1 + X_2 + X_3) = 3(100) = 300$, $\sigma_Y = 17.32$
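The contrast between tripling one student's time and adding three independent students is easy to miss, so here is a minimal side-by-side sketch (variable names are illustrative):

```python
import math

mu, sigma = 170.0, 10.0   # X ~ N(170, 10), second parameter is the std. deviation

# Y = 3X: the same random quantity scaled, so the std. deviation scales by 3.
mean_triple = 3 * mu
sd_triple = 3 * sigma

# Y = X1 + X2 + X3 with independent terms: variances add, not std. deviations.
mean_sum = 3 * mu
sd_sum = math.sqrt(3 * sigma**2)

print(f"3X:       mean {mean_triple:.0f}, sd {sd_triple:.2f}")
print(f"X1+X2+X3: mean {mean_sum:.0f}, sd {sd_sum:.2f}")
```

Both have mean 510, but the sum of independent times is much less spread out because high and low values partly offset each other.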


The amazon.com example

What is the probability that a randomly selected student has spent between 160 and 180 minutes on amazon.com last month?
$X \sim N(170, 10)$
$P[160 < X < 180] = P[(160 - 170)/10 < (X - \mu)/\sigma < (180 - 170)/10]$
$= P[-1 < Z < 1] = F(1) - F(-1) = 0.8413 - 0.1587 = 0.6826$

What is the probability that three independent students together have spent more than 460 minutes?
$Y = X_1 + X_2 + X_3 \sim N(510, 17.32)$
$P(Y > 460) = P((Y - 510)/17.32 > (460 - 510)/17.32) = P(Z > -2.89)$
$= 1 - P(Z < -2.89) = 1 - 0.0019 = 0.9981$
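Both probabilities can be checked without a table. The sketch below defines its own `normal_cdf` helper from the standard library's error function; the name is an assumption for this illustration.

```python
import math


def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd), with sd the standard deviation."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


# One student: X ~ N(170, 10)
p_between = normal_cdf(180, 170, 10) - normal_cdf(160, 170, 10)

# Three independent students: Y ~ N(510, sqrt(300))
sd_sum = math.sqrt(3) * 10
p_over_460 = 1 - normal_cdf(460, 510, sd_sum)

print(f"P(160 < X < 180) = {p_between:.4f}")
print(f"P(Y > 460)       = {p_over_460:.4f}")
```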

TOPIC 5:

Statistical Sampling

Annual income example

After having managed to successfully survey 100 families we have found that the observed sample mean of the annual income is $19,763 while the observed sample standard deviation is $4,000.

a) What is the distribution of the sample mean (including the form of the distribution, its mean and standard deviation)?

The sample mean $\bar{X}$ follows a normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$: $N(\mu, \sigma/\sqrt{n})$

Annual income example

b) What is the probability that the sample mean will be within $784 of the population mean?

$P(-784 < \bar{X} - \mu < 784)$
$= P\left(\frac{-784}{\sigma/\sqrt{n}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{784}{\sigma/\sqrt{n}}\right)$
$\approx P\left(\frac{-784}{s/\sqrt{n}} < Z < \frac{784}{s/\sqrt{n}}\right)$
$= P\left(\frac{-784}{4000/\sqrt{100}} < Z < \frac{784}{4000/\sqrt{100}}\right)$
$= P(-1.96 < Z < 1.96)$
$= P(Z < 1.96) - P(Z < -1.96)$
$= 0.975 - 0.025$
$= 0.95$

Annual income example

c) What is a number L such that the probability that the sample mean is within L of the population mean is 99%?

A 99% confidence interval for the sample mean is given by:
$[\bar{x} - c \cdot s/\sqrt{n},\ \bar{x} + c \cdot s/\sqrt{n}]$
(where $c = 2.576$ ($\beta = 99\%$), $s = 4000$, $n = 100$, and $\bar{x} = 19{,}763$)

Therefore, $L = c \cdot s/\sqrt{n} = 1030.4$

So the 99% confidence interval is given by:
$[19{,}763 - 2.576 \cdot 4000/\sqrt{100},\ 19{,}763 + 2.576 \cdot 4000/\sqrt{100}] = [18{,}732.6,\ 20{,}793.4]$

Annual income example

d) How many families should we successfully survey so that the probability that the sample mean is within $200 of the population mean is 95%?

To construct a 95% interval that is within $200 of the population mean, the required sample size n is given by:
$n = c^2 s^2 / L^2$
$= 1.96^2 \cdot 4000^2 / 200^2$
$= 1536.64 \approx 1537$
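Parts (b) through (d) of the annual income example all rest on the standard error s/sqrt(n). The sketch below recomputes them together; the names are illustrative, and the rounding before `ceil` is only a guard against floating-point noise.

```python
import math


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal Z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


s, n, x_bar = 4000.0, 100, 19763.0
standard_error = s / math.sqrt(n)   # 4000 / 10 = 400

# (b) Probability that the sample mean is within $784 of the population mean.
z = 784 / standard_error
p_within = standard_normal_cdf(z) - standard_normal_cdf(-z)

# (c) Half-width L and the 99% confidence interval (c = 2.576).
half_width = 2.576 * standard_error
ci_99 = (x_bar - half_width, x_bar + half_width)

# (d) Sample size for a 95% interval (c = 1.96) with half-width $200.
n_required = math.ceil(round((1.96 * s / 200) ** 2, 6))

print(f"(b) P = {p_within:.3f}")
print(f"(c) L = {half_width:.1f}, CI = [{ci_99[0]:.1f}, {ci_99[1]:.1f}]")
print(f"(d) n = {n_required}")
```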


The Room Service Example


A hotel manager would like to find out the mean time guests have to wait for room service. For a sample of 45 guests, the observed sample mean turned out to be 32 minutes, while the observed standard deviation was 11 minutes.

What is the 95% confidence interval for the mean time guests have to wait for room service?

We assume the mean time guests have to wait for room service is approximately Normal.

A 95% confidence interval for the mean time guests have to wait is given by:

[x̄ - c*s/√n, x̄ + c*s/√n]

where c = 1.96 (β = 95%), s = 11, n = 45, and x̄ = 32.

So the 95% confidence interval is given by:

[32 - 1.96*11/√45, 32 + 1.96*11/√45] = [28.79, 35.21]

TOPIC 7:
Regression


An Ice Cream Example

The fat content in a gallon of chocolate ice cream is believed to depend on Cream, Chocolate and Sugar according to:

Fat = A + B*Cream + C*Chocolate + D*Sugar

A multiple regression was run on data from 20 different batches of chocolate ice cream:

R Square:        0.8433
Standard Error:  13.73
Observations:    20

df
Regression:      3
Residual:        16
Total:           19

                  Coefficients   Standard Error   t-Stat.   Lower 95%   Upper 95%
Intercept         -8.94          19.95            -0.45     -51.24      33.35
Cream (ounces)    0.93           0.12             7.80      0.67        1.18
Choc. (ounces)    2.07           0.60             ???       ???         ???
Sugar (ounces)    2.47           1.33             1.86      -0.34       5.29

An Ice Cream Example

Correlation between different variables:

                  Fat (gm)   Cream (ounces)   Choc. (ounces)   Sugar (ounces)
Fat (gm)          1
Cream (ounces)    0.769      1
Choc. (ounces)    0.486      0.025            1
Sugar (ounces)    0.280      -0.099           0.409            1

Compute the 95% CI for Choc. coefficient

Critique model


An Ice Cream Example

Compute the 95% CI for Choc. Coefficient

The 95% confidence interval for the Choc. coefficient (coefficient 2.07, standard error 0.60), using c = 2.120 from the t-table, will be:

[2.07 - 2.120*0.60, 2.07 + 2.120*0.60]
= [0.798, 3.342]
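
The missing entries of the Choc. row can be filled in from the coefficient and its standard error. This sketch assumes the critical value c = 2.120 is read from a t-table with n - k - 1 = 20 - 3 - 1 = 16 degrees of freedom (it is hard-coded here, since the Python standard library has no t-distribution); the function name coef_interval is illustrative.

```python
def coef_interval(coef, se, c):
    """Confidence interval for a regression coefficient: coef +/- c * se."""
    half_width = c * se
    return coef - half_width, coef + half_width

n_obs, n_predictors = 20, 3
df = n_obs - n_predictors - 1       # residual degrees of freedom

c = 2.120                           # t-table value for 95% and 16 df (taken from the slide)
lo, hi = coef_interval(2.07, 0.60, c)
t_stat = 2.07 / 0.60                # t-statistic = coefficient / standard error

print(df, round(t_stat, 2), round(lo, 3), round(hi, 3))
```

Because the interval excludes 0 (equivalently, the t-statistic exceeds c in absolute value), the Choc. coefficient is significant at the 95% level.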


An Ice Cream Example

Critique model

Signs of Regression Coefficients

                  Coefficients   Standard Error   t-Stat.   Lower 95%   Upper 95%
Intercept         -8.94          19.95            -0.45     -51.24      33.35
Cream (ounces)    0.93           0.12             7.80      0.67        1.18
Choc. (ounces)    2.07           0.60             3.45      0.80        3.34
Sugar (ounces)    2.47           1.33             1.86      -0.34       5.29

The coefficients for Cream, Choc. and Sugar appear to make sense.

Significance test: 0 is in the confidence interval for the Sugar coefficient, so Sugar should be excluded from the regression.

R²: The value for R² is 0.8433, which indicates that the model has high explanatory power.

Multicollinearity:

                  Fat (gm)   Cream (ounces)   Choc. (ounces)   Sugar (ounces)
Fat (gm)          1
Cream (ounces)    0.769      1
Choc. (ounces)    0.486      0.025            1
Sugar (ounces)    0.280      -0.099           0.409            1

There is a high correlation between chocolate and sugar (> 0.4), hence we should eliminate one of these variables: sugar, because of its low t-statistic.


Heteroscedasticity:

[Figure: Cream Residual Plot, Residuals vs. Cream (ounces)]

[Figure: Chocolate Residual Plot, Residuals vs. Chocolate (ounces)]

There appears to be no heteroscedasticity.

Autocorrelation:

[Figure: Residuals vs. Sample Number, Residual Value against Sample Number]

There appears to be no autocorrelation.

Residual Distribution:

[Figure: Residual Frequency, Frequency against Residual]

The residuals appear to be normally distributed.

TOPIC 8:
Linear Optimization


Different Modes of Driving Example


Different Modes of Driving (DMD) is a manufacturer of cars and trucks.

Vehicles are processed in the paint and body shops.

Painting trucks takes 1.5 times as much time as painting cars. If the paint shop only paints trucks, then it paints 40 trucks/day. If it only paints cars, then it paints 60 cars/day.

Body work on cars and trucks takes the same amount of time. If the body shop only produces trucks, then it produces 50/day. If it only produces cars, then it produces 50/day.

Trucks contribute $500 and cars contribute $400 to profit.

Determine the daily production schedule to maximize profits.

Decision Variables: C = # cars, T = # trucks
Objective Function: Max 400 C + 500 T
Constraints:
Paint Shop: T/40 + C/60 <= 1 day
Body Shop: T/50 + C/50 <= 1 day
T, C >= 0 vehicles
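
With only two decision variables, the LP above can be solved by checking the corner points of the feasible region. The following is a minimal sketch of that idea in plain Python, not the Excel Solver approach used in the course; the names constraints and solve_lp are illustrative, and the method assumes the feasible region is bounded (true here).

```python
from itertools import combinations

# Each constraint is (a_C, a_T, rhs), meaning a_C*C + a_T*T <= rhs.
constraints = [
    (1 / 60, 1 / 40, 1.0),   # paint shop
    (1 / 50, 1 / 50, 1.0),   # body shop
    (-1.0, 0.0, 0.0),        # C >= 0
    (0.0, -1.0, 0.0),        # T >= 0
]

def solve_lp(constraints, profit=(400, 500), tol=1e-9):
    """Return (best profit, C, T) by enumerating intersections of constraint pairs."""
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:               # parallel lines: no corner point
            continue
        c = (r1 * b2 - r2 * b1) / det    # Cramer's rule for the 2x2 system
        t = (a1 * r2 - a2 * r1) / det
        if all(a * c + b * t <= r + tol for a, b, r in constraints):
            value = profit[0] * c + profit[1] * t
            if best is None or value > best[0]:
                best = (value, c, t)
    return best

value, cars, trucks = solve_lp(constraints)
print(round(value), round(cars), round(trucks))
```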



Different Modes of Driving Example...

Which are the Binding Constraints?

[Figure: Feasible Region in the (Cars, Trucks) plane, bounded by T/40 + C/60 <= 1, T/50 + C/50 <= 1, C >= 0 and T >= 0, with profit lines labeled Profit: $5,000/Day and Profit: $10,000/Day.]

Optimal Solution: Cars 30, Trucks 20
Total Profit: $22,000/Day

Different Modes of Driving Example...

Adjustable Cells

Cell   Name                        Final Value   Reduced Cost   Objective Coefficient   Allowable Increase   Allowable Decrease
$B$2   Cars Decision Variables     30            0              400                     100                  66.66666667
$B$3   Trucks Decision Variables   20            0              500                     100                  100

Constraints

Cell   Name                     Final Value   Shadow Price   Constraint R.H. Side   Allowable Increase   Allowable Decrease
$B$7   Paint Shop Constraints   1             12000          1                      0.25                 0.166666667
$B$8   Body Shop Constraints    1             10000          1                      0.2                  0.2

Economic Interpretation

An outside contractor offers to paint 8 more trucks (or 12 more cars) per day for $2,000. Should we accept the offer?

Yes, based on the shadow prices, this expansion is worth:
$12,000 * 8/40 - $2,000 = $400
and the increased capacity of 8/40 = 0.2 is within the allowable increase.

If the DMD company was given extra labor to increase productivity in the body shop by 5 cars (or trucks), what would DMD's profits become?

Increased profit is $10,000 * 5/50 = $1,000, so daily profit becomes $22,000 + $1,000 = $23,000,
and the increased capacity of 5/50 = 0.1 is within the allowable increase.
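
The shadow-price arithmetic can be cross-checked by re-solving for the corner point where both shop constraints are binding, with the right-hand sides perturbed. This is a sketch under the assumption that both constraints stay binding, which holds only within the allowable ranges of the sensitivity report; the function name profit_at is hypothetical.

```python
def profit_at(paint_rhs, body_rhs):
    """Profit at the point where C/60 + T/40 = paint_rhs and C/50 + T/50 = body_rhs."""
    a1, b1 = 1 / 60, 1 / 40              # paint shop coefficients for (C, T)
    a2, b2 = 1 / 50, 1 / 50              # body shop coefficients for (C, T)
    det = a1 * b2 - a2 * b1
    c = (paint_rhs * b2 - body_rhs * b1) / det
    t = (a1 * body_rhs - a2 * paint_rhs) / det
    return 400 * c + 500 * t

base = profit_at(1, 1)                              # current plan
paint_gain = profit_at(1 + 8 / 40, 1) - base        # 8 more trucks of paint capacity
body_gain = profit_at(1, 1 + 5 / 50) - base         # 5 more vehicles of body capacity
net_offer = paint_gain - 2000                       # after paying the contractor

print(round(base), round(paint_gain), round(net_offer), round(body_gain))
```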


Careful of the range!

[Figure: Value of Opt. Obj. against Value of Paint Shop RHS. The curve is piecewise linear: Slope = 24,000 up to RHS = 0.83 (objective 20 K), Slope = 12,000 from RHS = 0.83 to RHS = 1.25 (objective 22 K at RHS = 1.00 and 25 K at RHS = 1.25), and Slope = 0 beyond RHS = 1.25.]

In this range, every unit change in the RHS results in a $12,000 change in the objective function. This value is called the shadow price of the constraint over this range.
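
The three segments of the plot can be written as one function, which makes it easy to see that the shadow price is only a local slope. This sketch is an assumption-laden reading of the figure: the three linear pieces use the slopes shown (24,000, 12,000 and 0) and are anchored at the plotted values (22 K at RHS = 1.00, 25 K at RHS = 1.25); the function names are illustrative.

```python
def optimal_profit(paint_rhs):
    """Optimal daily profit as a function of the paint shop RHS (body shop RHS fixed at 1)."""
    # The value function of a maximization LP is concave in the RHS,
    # so it is the minimum of its linear pieces.
    return min(24000 * paint_rhs, 10000 + 12000 * paint_rhs, 25000)

def slope(paint_rhs, h=1e-6):
    """Numerical slope of the value function, i.e. the local shadow price."""
    return (optimal_profit(paint_rhs + h) - optimal_profit(paint_rhs - h)) / (2 * h)

for rhs in (0.5, 1.0, 1.5):
    print(rhs, optimal_profit(rhs), round(slope(rhs)))
```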

TOPIC 9:
Nonlinear Optimization


A Production Example
You are producing three products A, B and C. You need to satisfy production limits and resource availability constraints: (1) You can produce at most 1000, 800 and 700 units of A, B and C respectively; (2) The data for resource availability is as follows:

             A      B      C      Resources (in hours)
machine 1    2      1      3.5    1000
machine 2    0.2    0.8    1.2    350

Production levels influence the market price of each product:

PA = 200 - XA + 0.5 XB,
PB = 100 - 2 XB + 0.25 XA,
PC = 500 - XC

We want to maximize revenue.

Formulation
Decision Variables

X1A = product A to be produced by machine 1
X1B = product B to be produced by machine 1
X1C = product C to be produced by machine 1
X2A = product A to be produced by machine 2
X2B = product B to be produced by machine 2
X2C = product C to be produced by machine 2

Objective Function

Max PA * (X1A + X2A) + PB * (X1B + X2B) + PC * (X1C + X2C)


More Formulation...
Subject to:

Price:
PA = 200 - (X1A + X2A) + 0.5 * (X1B + X2B),
PB = 100 - 2 * (X1B + X2B) + 0.25 * (X1A + X2A),
PC = 500 - (X1C + X2C)

Resource:
Machine 1: 2 X1A + X1B + 3.5 X1C <= 1000
Machine 2: 0.2 X2A + 0.8 X2B + 1.2 X2C <= 350

Production Limit:
A: X1A + X2A <= 1000
B: X1B + X2B <= 800
C: X1C + X2C <= 700

Non-negativity: X1A, X2A, X1B, X2B, X1C, X2C, PA, PB, PC >= 0

Excel Solution

hours/unit        A            B           C           Machine Limit
Machine 1         2            1           3.5         1000 hours
Machine 2         0.2          0.8         1.2         350 hours
Capacity Limit    1000 units   800 units   700 units

Decision Variables (units):
                  A            B           C
Machine 1         58.82        23.53       125.00
Machine 2         58.82        23.53       125.00

Price A    $ 105.88
Price B    $ 35.29
Price C    $ 250.00

Objective Function:    $ 76,617.65    MAX (maximize revenues)

Constraints:
                      LHS             RHS
machine 1 capacity    578.68    <=    1000
machine 2 capacity    180.59    <=    350
product A limit       117.65    <=    1000
product B limit       47.06     <=    800
product C limit       250.00    <=    700

Sensitivity Report

Microsoft Excel 10.0 Sensitivity Report
Worksheet: [Book1]Products
Report Created: 12/9/2004 10:19:06 PM

Adjustable Cells

Cell    Name     Final Value   Reduced Gradient
$B$8    units    58.82         0
$C$8    units    23.53         0
$D$8    units    125.00        0
$B$9    units    58.82         0
$C$9    units    23.53         0
$D$9    units    125.00        0

Constraints

Cell     Name                      Final Value   Lagrange Multiplier
$B$19    machine 1 capacity LHS    578.68        0
$B$20    machine 2 capacity LHS    180.59        0
$B$21    product A limit LHS       117.65        0
$B$22    product B limit LHS       47.06         0
$B$23    product C limit LHS       250.00        0
$B$11    Price A units             $ 105.88      0
$B$12    Price B units             $ 35.29       0
$B$13    Price C units             $ 250.00      0

All Lagrange Multipliers are zero! All constraints are non-binding in the close proximity of the optimal solution.

Optimal solution occurs in the interior of the feasible region.
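
Because the optimum is interior, it should coincide with the unconstrained maximizer of the revenue function, which can be found by setting the partial derivatives to zero. This is a sketch of that check in plain Python, not the Excel Solver run from the slides; it works with the totals a = X1A + X2A, b = X1B + X2B, c = X1C + X2C, and it assumes the even split across the two machines shown in the Excel solution when checking machine hours.

```python
def revenue(a, b, c):
    """Total revenue when a, b, c units of products A, B, C are sold."""
    pa = 200 - a + 0.5 * b
    pb = 100 - 2 * b + 0.25 * a
    pc = 500 - c
    return pa * a + pb * b + pc * c

# First-order conditions of the revenue function:
#   dR/da = 200 - 2a + 0.75b = 0   ->   2a - 0.75b = 200
#   dR/db = 100 + 0.75a - 4b = 0   ->  -0.75a + 4b = 100
#   dR/dc = 500 - 2c         = 0   ->   c = 250
det = 2 * 4 - (-0.75) * (-0.75)
a = (200 * 4 - (-0.75) * 100) / det      # Cramer's rule
b = (2 * 100 - (-0.75) * 200) / det
c = 500 / 2
best_revenue = revenue(a, b, c)

# Feasibility check, splitting each product evenly across the two machines.
machine1_hours = 2 * (a / 2) + 1 * (b / 2) + 3.5 * (c / 2)
machine2_hours = 0.2 * (a / 2) + 0.8 * (b / 2) + 1.2 * (c / 2)
feasible = (machine1_hours <= 1000 and machine2_hours <= 350
            and a <= 1000 and b <= 800 and c <= 700)

print(round(a, 2), round(b, 2), c, round(best_revenue, 2), feasible)
```

Since the unconstrained maximizer satisfies every inequality with slack, no constraint restricts the solution, which is consistent with all Lagrange multipliers being zero.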


TOPIC 10:
Discrete Optimization


Let's play!

An electrical utility company decides each day which generators to start up. It has three generators (see below).

There are two periods in a day, and the number of megawatts needed in the first period is 2900. The second period requires 3900 megawatts. Unused electricity left over from period 1 can be used in period 2.

It wants to minimize total cost.

Formulate and solve as MIP!

Generator   Fixed costs per period ($)   Cost per period per megawatt used ($)   Max capacity in each period (MW)
A           3000                         5                                       2100
B           2000                         4                                       1800
C           1000                         7                                       3000



More Formulation...
Objective Function

minimize 3000 (XA1 + XA2) + 2000 (XB1 + XB2) + 1000 (XC1 + XC2)
+ 5 (YA1 + YA2) + 4 (YB1 + YB2) + 7 (YC1 + YC2)

Subject to:

Capacity:
Machine A: YA1 <= 2100 XA1 ; YA2 <= 2100 XA2
Machine B: YB1 <= 1800 XB1 ; YB2 <= 1800 XB2
Machine C: YC1 <= 3000 XC1 ; YC2 <= 3000 XC2

Demand:
Period 1: YA1 + YB1 + YC1 >= 2900
Period 2: YA2 + YB2 + YC2 + YA1 + YB1 + YC1 - 2900 >= 3900

Binary: XA1, XA2, XB1, XB2, XC1, XC2 = {1 if used; 0 otherwise}

Non-negativity: YA1, YA2, YB1, YB2, YC1, YC2 >= 0
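
With only six binary variables, the formulation above can be checked by brute force instead of a MIP solver. This is a minimal sketch, not the Excel model from the course: it enumerates all 64 on/off patterns, and for each pattern tries every period-1 production level in steps of 100 MW, dispatching the cheapest running generators first. The 100 MW step is an assumption that is exact here only because every capacity and demand is a multiple of 100; all names are illustrative.

```python
from itertools import product

FIXED = {"A": 3000, "B": 2000, "C": 1000}   # fixed cost per period ($)
UNIT = {"A": 5, "B": 4, "C": 7}             # cost per megawatt ($)
CAP = {"A": 2100, "B": 1800, "C": 3000}     # max capacity per period (MW)
D1, D2 = 2900, 3900                         # demand in periods 1 and 2 (MW)

def cheapest_dispatch(on, amount):
    """Variable cost of producing `amount` MW with the generators in `on`; None if infeasible."""
    cost, left = 0, amount
    for g in sorted(on, key=UNIT.get):      # cheapest per-MW generator first
        use = min(left, CAP[g])
        cost += UNIT[g] * use
        left -= use
    return cost if left == 0 else None

def solve():
    """Return (total cost, generators on in period 1, generators on in period 2, period-1 MW)."""
    best = None
    for bits in product([0, 1], repeat=6):
        on1 = [g for g, b in zip("ABC", bits[:3]) if b]
        on2 = [g for g, b in zip("ABC", bits[3:]) if b]
        fixed = sum(FIXED[g] for g in on1 + on2)
        # Period 1 must cover D1; any surplus carries over, so total output is D1 + D2.
        for p1 in range(D1, D1 + D2 + 1, 100):
            c1 = cheapest_dispatch(on1, p1)
            c2 = cheapest_dispatch(on2, D1 + D2 - p1)
            if c1 is None or c2 is None:
                continue
            total = fixed + c1 + c2
            if best is None or total < best[0]:
                best = (total, on1, on2, p1)
    return best

total, on1, on2, p1 = solve()
print(total, on1, on2)
```

Several period-1 production levels give the same minimum cost, so the split of generator A's output between the two periods is not unique.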


Excel Solution

Objective Function:
Fixed       10000
Variable    30400
Total       40400

Cost:
                         A       B       C
Fixed Cost per period    3000    2000    1000
Cost per megawatt        5       4       7

Decision Variables and Constraints:

                Xij (0 or 1)   Yij           Limit * Xij   Limit
Period 1   A    1              2100    <=    2100          2100
           B    1              1800    <=    1800          1800
           C    0              0       <=    0             3000
Period 2   A    1              1100    <=    2100          2100
           B    1              1800    <=    1800          1800
           C    0              0       <=    0             3000

Demand:
            Total                       Demand
Period 1    3900                  >=    2900
Period 2    2900 + 1000 = 3900    >=    3900

Other constraints:
Xij binary
Xij, Yij >= 0