
MSBA7003 Quantitative Analysis Methods

Tutorial 01
Haobo Yu
2021 – 2022
Email: hbyu@hku.hk Office: Room 808, KKL WeChat ID: yhb_Ber


Class notes

• Remember to start forming your final project group.
  • Teammates should come from the same subclass.
  • Please submit your group information through the template.

• The template of the answer sheet

• Grading policy for assignments
  • The correct answer may consist of one or more options.
  • Full marks for the correct answer and zero otherwise.

Tutorial

• Explanation of assignments

• More exercises
  • Similar to in-class exercises and assignments

• Software
  • Python is used most often
  • R, Excel

Agenda

• Joint distribution
  • Corpus Data

• Bayesian learning and decision making
  • Café du Donut

• Explanation of a homework problem
  • Thompson Lumber Company

• Coding for Bayesian Update
  • Toss coins in Python

Corpus Data

• Assume there is a corpus of 100 words (a corpus is a collection of
text). The words, their occurrences, and their probabilities in the
corpus are tabulated as follows:

Word (w)   Occurrences c(w)   Probability P(w)   Length (x)   Vowels (y)
the        30                 0.30               3            1
to         16                 0.16               2            1
some       15                 0.15               4            2
grade      10                 0.10               5            2
point       9                 0.09               5            2
fail        8                 0.08               4            2
pass        8                 0.08               4            1
HK          4                 0.04               2            0

Corpus Data

• We define the following random variables:
  • X: the length of the word;
  • Y: the number of vowels in the word.

• The probabilities of some events:
  • P(2 ≤ X ≤ 3) = P(to) + P(HK) + P(the) = 0.16 + 0.04 + 0.30 = 0.5
  • P(Y ≥ 2) = P(some) + P(grade) + P(point) + P(fail) = 0.42
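The two event probabilities above can be checked with a few lines of Python. This is only a minimal sketch: the dictionary `corpus` and the helper `prob` are illustrative names, not part of the course material.

```python
# Corpus table from the slide: word -> (P(w), length x, vowels y)
corpus = {
    "the": (0.30, 3, 1),
    "to": (0.16, 2, 1),
    "some": (0.15, 4, 2),
    "grade": (0.10, 5, 2),
    "point": (0.09, 5, 2),
    "fail": (0.08, 4, 2),
    "pass": (0.08, 4, 1),
    "HK": (0.04, 2, 0),
}


def prob(event):
    """Sum P(w) over all words whose (x, y) satisfies the event."""
    return sum(p for p, x, y in corpus.values() if event(x, y))


p_len_2_to_3 = prob(lambda x, y: 2 <= x <= 3)  # P(2 <= X <= 3)
p_vowels_ge_2 = prob(lambda x, y: y >= 2)      # P(Y >= 2)

print(round(p_len_2_to_3, 2), round(p_vowels_ge_2, 2))
```

Any other event on X and Y can be evaluated the same way by passing a different condition to `prob`.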

Joint Distribution
• We can describe the joint distribution between word length (X)
and number of vowels (Y):
  • Let f(x, y) = P(X = x, Y = y).
  • Examples:
    • f(4, 2) = P(some) + P(fail) = 0.15 + 0.08 = 0.23;
    • f(3, 1) = P(the) = 0.3;
    • f(5, 0) = 0.

Joint distribution f(x, y):

        y = 0   y = 1   y = 2
x = 2   0.04    0.16    0
x = 3   0       0.30    0
x = 4   0       0.08    0.23
x = 5   0       0       0.19
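As a rough illustration, the joint distribution can be built by accumulating P(w) over all words that share the same (length, vowels) pair. The names `words`, `joint`, and `f` below are assumptions made for this sketch.

```python
from collections import defaultdict

# (P(w), length x, vowels y) for each word in the corpus table
words = [
    (0.30, 3, 1),  # the
    (0.16, 2, 1),  # to
    (0.15, 4, 2),  # some
    (0.10, 5, 2),  # grade
    (0.09, 5, 2),  # point
    (0.08, 4, 2),  # fail
    (0.08, 4, 1),  # pass
    (0.04, 2, 0),  # HK
]

# Accumulate probability mass for each (x, y) pair
joint = defaultdict(float)
for p, x, y in words:
    joint[(x, y)] += p


def f(x, y):
    """Joint probability P(X = x, Y = y); zero if no word has that pair."""
    return joint.get((x, y), 0.0)


print(round(f(4, 2), 2), round(f(3, 1), 2), f(5, 0))
```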

Corpus Data

• According to the joint distribution f(x, y), we can calculate the
marginal distributions (f_X and f_Y) and the conditional distribution.

          y = 0   y = 1   y = 2   f_X
x = 2     0.04    0.16    0       0.20
x = 3     0       0.30    0       0.30
x = 4     0       0.08    0.23    0.31
x = 5     0       0       0.19    0.19
f_Y       0.04    0.54    0.42
Y | X=2   0.2     0.8     0
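The marginals are row and column sums of the joint table, and the conditional distribution of Y given X = x is the row for x divided by f_X(x). A minimal sketch in plain Python follows; the variable and function names are illustrative.

```python
# Joint table f(x, y): rows are x = 2..5, columns are y = 0, 1, 2
joint = {
    2: [0.04, 0.16, 0.00],
    3: [0.00, 0.30, 0.00],
    4: [0.00, 0.08, 0.23],
    5: [0.00, 0.00, 0.19],
}

# Marginal of X: sum each row. Marginal of Y: sum each column.
f_X = {x: sum(row) for x, row in joint.items()}
f_Y = [sum(joint[x][j] for x in joint) for j in range(3)]


def conditional_y_given_x(x):
    """Distribution of Y given X = x: the row for x divided by f_X(x)."""
    return [p / f_X[x] for p in joint[x]]


print({x: round(p, 2) for x, p in f_X.items()})
print([round(p, 2) for p in f_Y])
print([round(p, 2) for p in conditional_y_given_x(2)])
```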

Independence between two (discrete) RVs

• Finally, based on the joint distribution f(x, y) and the marginal
distributions, we can examine whether the length of the word (X) and
the number of vowels in the word (Y) are independent.

• Recall the definition of independence of two events: P(AB) = P(A)P(B).

• Independence of two (discrete) RVs X and Y:
  • Independence of two sets of basic events.
  • The basic events cannot be further divided. For example, the value of X can
  be 2, 3, 4 or 5, so its basic events are X=2, X=3, X=4 and X=5. For Y, the basic events
  are Y=0, Y=1 and Y=2.
  • P(X = i, Y = j) = P(X = i) * P(Y = j) for any i in {2, 3, 4, 5} and any j in
  {0, 1, 2}.

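Independence between two variables

Joint distribution f(x, y):

        y = 0    y = 1    y = 2    f_X
x = 2   0.04     0.16     0        0.20
x = 3   0        0.30     0        0.30
x = 4   0        0.08     0.23     0.31
x = 5   0        0        0.19     0.19
f_Y     0.04     0.54     0.42

Product of the marginals f_X * f_Y:

        y = 0    y = 1    y = 2    f_X
x = 2   0.008    0.108    0.084    0.20
x = 3   0.012    0.162    0.126    0.30
x = 4   0.0124   0.1674   0.1302   0.31
x = 5   0.0076   0.1026   0.0798   0.19
f_Y     0.04     0.54     0.42

• The two tables differ (for example, f(2, 0) = 0.04 but f_X(2) * f_Y(0) = 0.008),
so X and Y are not independent.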
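The comparison between the joint table and the product of the marginals can be automated. The sketch below assumes the joint table from the slides and uses `math.isclose` for the cell-by-cell comparison; the names `product` and `independent` are illustrative.

```python
import math

# Joint table f(x, y): rows are x = 2..5, columns are y = 0, 1, 2
joint = {
    2: [0.04, 0.16, 0.00],
    3: [0.00, 0.30, 0.00],
    4: [0.00, 0.08, 0.23],
    5: [0.00, 0.00, 0.19],
}

f_X = {x: sum(row) for x, row in joint.items()}
f_Y = [sum(joint[x][j] for x in joint) for j in range(3)]

# Table of f_X(x) * f_Y(y), which would equal the joint table under independence
product = {x: [f_X[x] * fy for fy in f_Y] for x in joint}

# X and Y are independent only if every cell matches
independent = all(
    math.isclose(joint[x][j], product[x][j], abs_tol=1e-9)
    for x in joint
    for j in range(3)
)

print(independent)
```

A single mismatching cell is enough to rule out independence, so the check could stop at the first difference.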

Café du Donut

• The Café buys donuts each day for $40 per carton of 20 dozen
donuts. Any cartons not sold are thrown away at the end of the
day. If a carton is sold, the revenue is $60.

• Unlike the case in Session 2, the salesperson faces two possible
demand situations and needs to decide the order size each day.

• Suppose that the order size (Q) can only be either 6 or 7 due to
storage and delivery capacity.

Café du Donut
• The salesperson’s initial belief is that the two demand situations are
equally likely.

Daily demand   Probability under   Probability under   Marginal
(cartons)      low demand          high demand         probability
4              0.25                0.05                0.15
5              0.20                0.10                0.15
6              0.15                0.10                0.125
7              0.15                0.15                0.15
8              0.10                0.15                0.125
9              0.10                0.20                0.15
10             0.05                0.25                0.15

• On the first day, should the order size be 6 or 7?

Café du Donut

• Monetary Payoff Table

        D=4    D=5    D=6     D=7    D=8     D=9    D=10   EMV
Q=6     0      60     120     120    120     120    120    93
Q=7     -40    20     80      140    140     140    140    87.5
Prob.   0.15   0.15   0.125   0.15   0.125   0.15   0.15
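The payoff table and the two EMVs can be reproduced with a short script. This is a sketch under the stated assumptions ($40 cost per carton ordered, $60 revenue per carton sold, unsold cartons worth nothing); `payoff` and `emv` are illustrative names.

```python
COST, PRICE = 40, 60  # dollars per carton ordered / per carton sold

demands = [4, 5, 6, 7, 8, 9, 10]
p_low = [0.25, 0.20, 0.15, 0.15, 0.10, 0.10, 0.05]   # P(D | low demand)
p_high = [0.05, 0.10, 0.10, 0.15, 0.15, 0.20, 0.25]  # P(D | high demand)


def payoff(q, d):
    """Profit when q cartons are ordered and demand is d cartons."""
    return PRICE * min(q, d) - COST * q


def emv(q, belief_low):
    """Expected monetary value of ordering q, given P(low demand) = belief_low."""
    probs = [belief_low * lo + (1 - belief_low) * hi
             for lo, hi in zip(p_low, p_high)]
    return sum(p * payoff(q, d) for p, d in zip(probs, demands))


# Initial belief: the two demand situations are equally likely
print(round(emv(6, 0.5), 2), round(emv(7, 0.5), 2))
```

With the initial belief, ordering 6 cartons gives the higher EMV, which matches the table.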

• If the salesperson finds that the demand on the first day is 8, should he
increase the order size from 6 to 7?
  • Here, we assume that the demand situation on the second day has the
  same distribution as on the first day.

Café du Donut

• The salesperson updates his belief:

                 Low Demand   High Demand   Marginal
D1 = 8           0.1*0.5      0.15*0.5      1/8
Updated belief   2/5          3/5

        D=4    D=5    D=6    D=7    D=8    D=9    D=10   EMV
Q=6     0      60     120    120    120    120    120    96
Q=7     -40    20     80     140    140    140    140    92.6
Prob.   0.13   0.14   0.12   0.15   0.13   0.16   0.17

• The order size should not be increased.
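The belief update and the second-day EMVs can be sketched as follows. The code applies Bayes' rule to the observed demand and then recomputes the EMV under the updated belief; the function names `update`, `payoff`, and `emv` are assumptions for illustration.

```python
COST, PRICE = 40, 60  # dollars per carton ordered / per carton sold

demands = [4, 5, 6, 7, 8, 9, 10]
p_low = [0.25, 0.20, 0.15, 0.15, 0.10, 0.10, 0.05]   # P(D | low demand)
p_high = [0.05, 0.10, 0.10, 0.15, 0.15, 0.20, 0.25]  # P(D | high demand)


def update(belief_low, observed):
    """Posterior P(low demand) after observing one day's demand (Bayes' rule)."""
    i = demands.index(observed)
    joint_low = belief_low * p_low[i]
    joint_high = (1 - belief_low) * p_high[i]
    return joint_low / (joint_low + joint_high)


def payoff(q, d):
    """Profit when q cartons are ordered and demand is d cartons."""
    return PRICE * min(q, d) - COST * q


def emv(q, belief_low):
    """Expected monetary value of ordering q, given P(low demand) = belief_low."""
    probs = [belief_low * lo + (1 - belief_low) * hi
             for lo, hi in zip(p_low, p_high)]
    return sum(p * payoff(q, d) for p, d in zip(probs, demands))


posterior_low = update(0.5, observed=8)  # belief after D1 = 8
print(round(posterior_low, 2), round(emv(6, posterior_low), 2),
      round(emv(7, posterior_low), 2))
```

Calling `update` repeatedly, one observed demand at a time, would carry the belief forward over several days under the same assumption that the demand situation does not change.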

Thompson Lumber Company

• The profit of each decision under each state of the market remains the
same as in the slides of Session 2. The market can again be
favorable or unfavorable. Now Thompson Lumber Company is
deciding whether to hire a consulting company to predict
the market.
• We have a prior belief that P(Fav) = 0.8.

                          State of nature
Alternative               Favorable market   Unfavorable market
                          (profit in $)      (profit in $)
Construct a large plant   200,000            –180,000
Construct a small plant   100,000            –20,000
Do nothing                0                  0
Probability               0.8                0.2

Thompson Lumber Company

• The historical data from ABC, Inc. is as follows.

              Positive   Negative   Total
Favorable     35         20         55
Unfavorable   20         25         45
Total         55         45

• Then P(Pos)=? P(Neg)=? P(Fav|Pos)=? P(Unf|Neg)=?

• Should the company hire ABC to conduct the survey?

Thompson Lumber Company


• When ABC is not hired,

Decision tree without the survey (payoffs at the end of each branch):

Construct a large plant (state-of-nature node 1): EMV1 = 124k
    Favorable Market (0.8)      $200,000
    Unfavorable Market (0.2)    –$180,000
Construct a small plant (state-of-nature node 2): EMV2 = 76k
    Favorable Market (0.8)      $100,000
    Unfavorable Market (0.2)    –$20,000
Do nothing                      $0

Without survey: EMV = 124k
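The EMVs without the survey follow directly from the payoff table and the prior P(Fav) = 0.8. A minimal sketch, with `payoffs`, `emv`, and `best` as illustrative names:

```python
# Payoffs in dollars: alternative -> (favorable market, unfavorable market)
payoffs = {
    "large plant": (200_000, -180_000),
    "small plant": (100_000, -20_000),
    "do nothing": (0, 0),
}
p_fav = 0.8  # prior belief that the market is favorable

# EMV of each alternative under the prior
emv = {
    alt: p_fav * fav + (1 - p_fav) * unf
    for alt, (fav, unf) in payoffs.items()
}
best = max(emv, key=emv.get)

print(best, round(emv[best]))
```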

Thompson Lumber Company


• The belief is updated based on the consulting report:

Joint Prob.   Positive     Negative    Marginal
Favorable     0.8*35/55    0.8*20/55   0.8
Unfavorable   0.2*20/45    0.2*25/45   0.2
Marginal      28/55+4/45   16/55+1/9

• P(Pos) = 28/55 + 4/45 = 0.598
• P(Neg) = 16/55 + 1/9 = 0.402
• P(Fav|Pos) = (28/55)/(28/55 + 4/45) = 0.8514
• P(Unf|Neg) = (1/9)/(16/55 + 1/9) = 0.2764
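The same update can be carried out with exact fractions, which avoids rounding until the final step. The sketch below estimates the likelihoods P(result | state) from the historical counts and combines them with the prior; the dictionary names are illustrative.

```python
from fractions import Fraction

# Historical counts from ABC, Inc.: state -> survey result -> count
counts = {
    "Fav": {"Pos": 35, "Neg": 20},
    "Unf": {"Pos": 20, "Neg": 25},
}
prior = {"Fav": Fraction(4, 5), "Unf": Fraction(1, 5)}  # P(Fav) = 0.8

# Likelihood P(result | state), estimated from each row of the counts
likelihood = {
    s: {r: Fraction(n, sum(row.values())) for r, n in row.items()}
    for s, row in counts.items()
}

# Joint P(state, result), marginal P(result), posterior P(state | result)
joint = {(s, r): prior[s] * likelihood[s][r]
         for s in prior for r in ("Pos", "Neg")}
marginal = {r: joint[("Fav", r)] + joint[("Unf", r)] for r in ("Pos", "Neg")}
posterior = {(s, r): joint[(s, r)] / marginal[r] for (s, r) in joint}

print(round(float(marginal["Pos"]), 3), round(float(marginal["Neg"]), 3))
print(round(float(posterior[("Fav", "Pos")]), 4),
      round(float(posterior[("Unf", "Neg")]), 4))
```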

Thompson Lumber Company

• Decision Tree with Sample Information

If a consulting survey is provided (node 1): EMV = 113k

Positive survey result, second decision point: best EMV = $133,000
    Large Plant (node 2): EMV = $133,000
        Favorable Market (0.85)      $190,000
        Unfavorable Market (0.15)    –$190,000
    Small Plant (node 3): EMV = $72,000
        Favorable Market (0.85)      $90,000
        Unfavorable Market (0.15)    –$30,000
    No Plant                         –$10,000
Negative survey result, second decision point: best EMV = $83,600
    Large Plant (node 4): EMV = $83,600
        Favorable Market (0.72)      $190,000
        Unfavorable Market (0.28)    –$190,000
    Small Plant (node 5): EMV = $56,400
        Favorable Market (0.72)      $90,000
        Unfavorable Market (0.28)    –$30,000
    No Plant                         –$10,000

• Since EMV with survey (113k) < EMV without survey (124k), Thompson
Lumber Company should not hire ABC, Inc. to conduct a market survey.
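The tree can be rolled back in code: pick the best alternative at each second decision point, then weight those EMVs by the probability of each survey result. The sketch below uses the rounded probabilities shown in the tree (0.85, 0.72, 0.598, 0.402) and the tree's payoffs, which are each $10,000 lower than in the original payoff table (this is consistent with a $10,000 survey fee, which is an inference and not stated on the slide).

```python
# Payoffs in dollars as shown in the tree: (favorable, unfavorable)
payoffs = {
    "large plant": (190_000, -190_000),
    "small plant": (90_000, -30_000),
    "no plant": (-10_000, -10_000),
}
p_fav_given = {"Pos": 0.85, "Neg": 0.72}  # rounded posteriors P(Fav | result)
p_result = {"Pos": 0.598, "Neg": 0.402}   # P(Pos), P(Neg)


def best_action(p_fav):
    """Return the alternative with the highest EMV and that EMV."""
    emvs = {a: p_fav * fav + (1 - p_fav) * unf
            for a, (fav, unf) in payoffs.items()}
    action = max(emvs, key=emvs.get)
    return action, emvs[action]


# Roll back: expected value of the best action over the survey results
emv_with_survey = sum(
    p_result[r] * best_action(p_fav_given[r])[1] for r in p_result
)

print(best_action(0.85)[0], best_action(0.72)[0], round(emv_with_survey))
```

The large plant is chosen after either survey result, so in this setting the survey does not change the decision.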

Thompson Lumber Company

• Why is the decision different?

Joint Prob.   Positive     Negative     Marginal
Favorable     0.78*0.353   0.27*0.647   0.45
Unfavorable   0.22*0.353   0.73*0.647   0.55
Marginal      0.353        0.647

Joint Prob.   Positive     Negative     Marginal
Favorable     0.8*35/55    0.8*20/55    0.8
Unfavorable   0.2*20/45    0.2*25/45    0.2
Marginal      0.598        0.402

• Error rate: P(Unf, Pos) + P(Fav, Neg) = P(Pos|Unf)P(Unf) + P(Neg|Fav)P(Fav)
  • Top table: 0.27*0.647 + 0.22*0.353 = 0.25235
  • Bottom table: 20/45*0.2 + 20/55*0.8 = 0.38
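The error rate is the total probability of the two cells where the survey result disagrees with the market state. A minimal sketch for both tables, with `top`, `bottom`, and `error_rate` as illustrative names:

```python
# Joint probabilities P(state, result) taken from the two tables
top = {
    ("Fav", "Pos"): 0.78 * 0.353,
    ("Fav", "Neg"): 0.27 * 0.647,
    ("Unf", "Pos"): 0.22 * 0.353,
    ("Unf", "Neg"): 0.73 * 0.647,
}
bottom = {
    ("Fav", "Pos"): 0.8 * 35 / 55,
    ("Fav", "Neg"): 0.8 * 20 / 55,
    ("Unf", "Pos"): 0.2 * 20 / 45,
    ("Unf", "Neg"): 0.2 * 25 / 45,
}


def error_rate(joint):
    """P(Unf, Pos) + P(Fav, Neg): the survey result contradicts the market."""
    return joint[("Unf", "Pos")] + joint[("Fav", "Neg")]


print(round(error_rate(top), 5), round(error_rate(bottom), 2))
```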

Tossing Coins

• What is the probability of getting a head?
  • Suppose there are three possible cases: 1/3, 1/2, and 2/3.

• What if we can toss the coin many times?
  • Please refer to “TossCoin.py”.
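The script “TossCoin.py” is not reproduced here. The following is an independent minimal sketch of the idea, under two assumptions that are not stated on the slide: the prior over the three cases is uniform, and the true head probability used in the simulation is 2/3. The names `update` and `simulate` are illustrative.

```python
import random

candidates = [1 / 3, 1 / 2, 2 / 3]  # the three possible head probabilities


def update(belief, is_head):
    """One Bayesian update of the belief over candidates after a single toss."""
    likelihood = [p if is_head else 1 - p for p in candidates]
    joint = [b * l for b, l in zip(belief, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


def simulate(true_p, n_tosses, seed=0):
    """Toss a coin with head probability true_p and update after every toss."""
    rng = random.Random(seed)
    belief = [1 / 3, 1 / 3, 1 / 3]  # assumed uniform prior
    for _ in range(n_tosses):
        belief = update(belief, rng.random() < true_p)
    return belief


belief = simulate(true_p=2 / 3, n_tosses=1000)
print([round(b, 4) for b in belief])
```

After a single head the belief moves from (1/3, 1/3, 1/3) to (2/9, 1/3, 4/9); with many tosses it tends to concentrate on the candidate closest to the true probability.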
