SEHH107 1

COMPUTATIONAL
TOOLS FOR
STATISTICS
RANDOM VARIABLES AND PROBABILITY DISTRIBUTION
OUTLINE

• Random Variables

• Discrete Probability Distribution


– Expectation and Variance of Discrete Random Variables
– Binomial Distribution

• Continuous Probability Distribution


– Uniform Distribution
RANDOM VARIABLES
• Consider the following two cases. Which one generates a random variable? (Answer: Crystal's case)

– Crystal is so addicted to a new drama series that she has not studied at all. She decides to sit the exam unprepared and to pick her answers to the multiple-choice questions by tossing an eraser.

– Gigi is a hard-working student who has revised every point covered in the notes. She can answer the questions freely, without any pressure.
RANDOM VARIABLES

• Definition:
– Assigning a numerical value to each outcome of a random event

• Statistical Experiment
– Two key ideas
• Definite outcomes
• Random events
DISCRETE RANDOM VARIABLE

• Takes a countable list of distinct values

• Usually obtained by counting

• e.g. Visit a restaurant, which opens at random, 5 times

– Let X be the number of visits on which the restaurant is open
• (X = 0, 1, 2, 3, 4, 5)
DISCRETE RANDOM VARIABLE

• $P(X = x_i) = p_i = P(x_i)$ is the probability that the random variable $X$ takes the value $x_i$

• Two conditions:
– Each $p_i$ must be between 0 and 1
– The sum of all $p_i$ must equal 1

• Probability distribution function / probability mass function (pmf)

– A table or rule that shows the values of the discrete variable and the corresponding probabilities

$x_i$          $x_1$  $x_2$  $x_3$  …
$P(X = x_i)$   $p_1$  $p_2$  $p_3$  …
DISCRETE RANDOM VARIABLE

• Cumulative probability distribution

– A table or rule that gives $P(X \le x)$ for any real number $x$
– It is also denoted $F(x)$

– The following table shows the pmf of the number of wins in 3 matches of Arena of Valor (傳說對決) for a group of players

$x_i$          0    1    2    3
$P(X = x_i)$   0.5  0.1  0.3  0.1

– Find $P(X = 1)$: $P(X = 1) = 0.1$
– Find $P(X > 2)$: $P(X > 2) = P(X = 3) = 0.1$
– Graph the cumulative probability distribution for the above case
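The pmf table above can be checked, and its cumulative distribution tabulated, with a few lines of Python. This is only a minimal sketch: the names `pmf` and `cdf` are illustrative and do not come from the original slides.

```python
# pmf of the number of wins in 3 matches (values taken from the table above)
pmf = {0: 0.5, 1: 0.1, 2: 0.3, 3: 0.1}

# Each probability must lie in [0, 1] and together they must sum to 1
assert all(0 <= p <= 1 for p in pmf.values())
assert abs(sum(pmf.values()) - 1) < 1e-12


def cdf(x):
    """F(x) = P(X <= x): add up the pmf over all values not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)


p_equal_1 = pmf[1]        # P(X = 1)
p_greater_2 = 1 - cdf(2)  # P(X > 2) = 1 - F(2)

# Tabulate F(x) at each possible value; these are the heights of the steps
for x in sorted(pmf):
    print(f"F({x}) = {cdf(x):.1f}")
```

Plotting these values as a step function (flat between the integers, jumping at 0, 1, 2 and 3) gives the requested graph of the cumulative distribution.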
DISCRETE RANDOM VARIABLE

• Suppose the pmf in the previous example was calculated from the results of 100 players. What is the mean number of wins? What is the standard deviation of the number of wins?

• $E(X) = \mu = \sum x_i p_i = 0(0.5) + 1(0.1) + 2(0.3) + 3(0.1) = 1$

• $\mathrm{Var}(X) = \sigma^2 = \sum (x_i - \mu)^2 p_i = (0-1)^2(0.5) + (1-1)^2(0.1) + (2-1)^2(0.3) + (3-1)^2(0.1) = 1.2$

• Standard deviation: $\sigma = \sqrt{1.2} \approx 1.0954$
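The same mean, variance and standard deviation can be reproduced numerically. The sketch below assumes the pmf from the wins example; the variable names are illustrative only.

```python
import math

# pmf of the number of wins (from the earlier table)
pmf = {0: 0.5, 1: 0.1, 2: 0.3, 3: 0.1}

# E(X): probability-weighted sum of the values
mean = sum(x * p for x, p in pmf.items())

# Var(X): probability-weighted sum of squared deviations from the mean
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Standard deviation is the square root of the variance
std_dev = math.sqrt(variance)

print(f"mean = {mean:.4f}, variance = {variance:.4f}, sd = {std_dev:.4f}")
```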
DISCRETE RANDOM VARIABLE

• If $Z = X + Y$, then $E(Z) = E(X + Y) = E(X) + E(Y)$. This is the linearity of expectation.
• If $a$ is a constant, $E(a) = a$ and $\mathrm{Var}(a) = 0$
• If $k$ is another constant, $E(kX) = kE(X)$ and $\mathrm{Var}(kX) = k^2\,\mathrm{Var}(X)$

• Given $E(X) = \mu$, show that $\mathrm{Var}(X) = E(X^2) - \mu^2$

• $E[(X - \mu)^2] = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - \mu^2$
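These identities can be sanity-checked numerically on any pmf. The sketch below reuses the wins pmf from the earlier example; the helper `expectation` and the constant `k = 3` are arbitrary choices made for illustration.

```python
# pmf of the number of wins (from the earlier example)
pmf = {0: 0.5, 1: 0.1, 2: 0.3, 3: 0.1}


def expectation(g):
    """E[g(X)] for the discrete pmf above."""
    return sum(g(x) * p for x, p in pmf.items())


mu = expectation(lambda x: x)

# Variance from the definition E[(X - mu)^2] ...
var_definition = expectation(lambda x: (x - mu) ** 2)
# ... and from the shortcut E(X^2) - mu^2
var_shortcut = expectation(lambda x: x ** 2) - mu ** 2

# Scaling by a constant k (k = 3 is an arbitrary illustrative value)
k = 3
mean_kx = expectation(lambda x: k * x)
var_kx = expectation(lambda x: (k * x - mean_kx) ** 2)

print(var_definition, var_shortcut)  # the two variance formulas agree
print(mean_kx, k * mu)               # E(kX) = k E(X)
print(var_kx, k ** 2 * var_definition)  # Var(kX) = k^2 Var(X)
```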
DISCRETE RANDOM VARIABLE

• The profit function of an Instagram (IG) shop is 10000 + 150X, where 10000 is the constant profit from advertising and X is a random variable representing the number of products sold by the shop. Given that X has a mean of 200 and a standard deviation of 5, what are the mean and standard deviation of the shop's profit?

• Mean = $E[10000 + 150X] = 10000 + 150E(X) = 10000 + 150(200) = 40000$

• Variance = $\mathrm{Var}[10000 + 150X] = 150^2\,\mathrm{Var}(X) = 150^2(25) = 562500$
• Standard deviation = $\sqrt{562500} = 750$
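The rules $E(a + bX) = a + bE(X)$ and $\mathrm{Var}(a + bX) = b^2\,\mathrm{Var}(X)$ used here translate directly into code. The function name `linear_transform_stats` is a hypothetical helper introduced only for this sketch.

```python
import math


def linear_transform_stats(a, b, mean_x, sd_x):
    """Return the mean and standard deviation of a + b*X.

    The constant a shifts the mean but does not affect the spread;
    the factor b scales the variance by b**2.
    """
    mean = a + b * mean_x
    variance = b ** 2 * sd_x ** 2
    return mean, math.sqrt(variance)


# Profit = 10000 + 150X with E(X) = 200 and sd(X) = 5
profit_mean, profit_sd = linear_transform_stats(10000, 150, 200, 5)
print(profit_mean, profit_sd)
```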
DISCRETE RANDOM VARIABLES

• Extending the analysis to a list of random variables:

• The expected value of the sum of the random variables

• $E(X_1 + X_2 + X_3 + \dots + X_N) = \mu_1 + \mu_2 + \mu_3 + \dots + \mu_N$

• The variance of the sum of the random variables, when all the variables are independent

• $\mathrm{Var}(X_1 + X_2 + X_3 + \dots + X_N) = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \dots + \sigma_N^2$


DISCRETE RANDOM VARIABLES

• Gigi and Crystal are decorating their office. The amounts spent by Gigi and Crystal are given by the functions 500 + 5X and 550 + 8Y respectively, where X and Y are independent random variables that depend on the number of tasks to be completed. X has a mean of 5 and a variance of 0.25, while Y has a mean of 3 and a standard deviation of 0.09. Find the mean and variance of the total amount spent on decorating the office.

• Mean = $E[(500 + 5X) + (550 + 8Y)] = E[500 + 5X] + E[550 + 8Y] = 500 + 5(5) + 550 + 8(3) = 1099$
• Variance = $\mathrm{Var}[(500 + 5X) + (550 + 8Y)] = \mathrm{Var}[500 + 5X] + \mathrm{Var}[550 + 8Y] = 25(0.25) + 64(0.09^2) = 6.7684$
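The calculation can be retraced step by step in Python. Note that Y is given by its standard deviation, so it has to be squared before the variances are added; the variable names below are illustrative.

```python
# Given information about the independent random variables X and Y
mean_x, var_x = 5, 0.25
mean_y, sd_y = 3, 0.09
var_y = sd_y ** 2  # convert the standard deviation of Y to a variance

# Gigi spends 500 + 5X, Crystal spends 550 + 8Y
gigi_mean = 500 + 5 * mean_x
gigi_var = 5 ** 2 * var_x
crystal_mean = 550 + 8 * mean_y
crystal_var = 8 ** 2 * var_y

# Means always add; variances add here only because X and Y are independent
total_mean = gigi_mean + crystal_mean
total_var = gigi_var + crystal_var

print(total_mean, round(total_var, 4))
```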
DISCRETE RANDOM VARIABLES
• When you study the variables together, you are looking at the joint probability of the random variables

• $P(X = x_j, Y = y_k)$ denotes the joint probability that $X = x_j$ and $Y = y_k$

– If the variables are independent, $P(X = x_j, Y = y_k) = P(X = x_j)\,P(Y = y_k)$

• $F(x_j, y_k)$ denotes the cumulative probability function, i.e. the probability that $X \le x_j$ and $Y \le y_k$

• When we look at a single variable and disregard the values of the others in a joint probability distribution, we obtain the marginal probability

– $P(X = x_j) = \sum_k p_{jk}$, where $p_{jk} = P(X = x_j, Y = y_k)$
• Conditional probability
– $P(Y = y_k \mid X = x_j) = \dfrac{P(X = x_j, Y = y_k)}{P(X = x_j)}$
DISCRETE RANDOM VARIABLES
X=1 X=2 X=3 Total
Y=1 0.2 0.1 0.05 0.35
Y=2 0.15 0.2 0.1 0.45
Y=3 0.05 0.1 0.05 0.2
Total 0.4 0.4 0.2 1

The numbers of dogs (X) and cats (Y) visiting the Luxury Pet Hotel are random; their joint probability distribution is shown in the table above.

• What is the probability that 3 dogs and 2 cats visit the Hotel? $P(X = 3, Y = 2) = 0.1$

• What is the probability that 3 dogs visit the Hotel? $P(X = 3) = 0.2$

• What is the probability that 3 dogs visit the Hotel given that there are already 2 cats in the Hotel? $P(X = 3 \mid Y = 2) = 0.1/0.45 = 0.2222$
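The three answers correspond to a joint, a marginal and a conditional probability. A minimal sketch of how they can be read off the table programmatically follows; the dictionary layout and function names are assumptions made for illustration.

```python
# joint[(x, y)] = P(X = x, Y = y), copied from the table above
joint = {
    (1, 1): 0.20, (2, 1): 0.10, (3, 1): 0.05,
    (1, 2): 0.15, (2, 2): 0.20, (3, 2): 0.10,
    (1, 3): 0.05, (2, 3): 0.10, (3, 3): 0.05,
}


def marginal_x(x):
    """P(X = x): sum the joint probabilities over every value of Y."""
    return sum(p for (xj, _), p in joint.items() if xj == x)


def marginal_y(y):
    """P(Y = y): sum the joint probabilities over every value of X."""
    return sum(p for (_, yk), p in joint.items() if yk == y)


def conditional_x_given_y(x, y):
    """P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)."""
    return joint[(x, y)] / marginal_y(y)


print(joint[(3, 2)])                         # joint probability
print(round(marginal_x(3), 4))               # marginal probability
print(round(conditional_x_given_y(3, 2), 4)) # conditional probability
```

In this table $P(X = 3, Y = 2) = 0.1$ differs from $P(X = 3)\,P(Y = 2) = 0.2 \times 0.45 = 0.09$, so X and Y are not independent.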
BINOMIAL RANDOM VARIABLE

• Binomial Experiment
– n trials
– 2 possible outcomes
• Success and failure
– Trials are independent
– Probability of success is constant (p)
• Probability of failure is (1 − p)

– Binomial random variable (X): number of successes in the n independent trials of a binomial experiment


BINOMIAL RANDOM VARIABLE

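• Probability of k successes in n independent trials of a binomial experiment:

$P(X = k) = C_k^n\, p^k (1 - p)^{n-k}$

• Where
• n is the number of trials
• k is the number of successes
• p is the probability of success
• $C_k^n = \dfrac{n!}{k!\,(n-k)!}$ is the number of ways of choosing which k of the n trials are successes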
BINOMIAL RANDOM VARIABLE

• $P(X \le 5)$: smaller/fewer than or equal to 5; no more than 5; at most 5

• $P(X < 5)$: smaller/fewer than 5

• $P(X \ge 5)$: larger than or equal to 5; at least 5; not less than 5; 5 or more

• $P(X > 5)$: larger than 5
BINOMIAL RANDOM VARIABLE
• Crystal is addicted to products featuring giraffes. The probability that she buys such a product immediately is 0.7. In 10 random encounters with a giraffe product, find the following probabilities:
• X ~ B(10, 0.7), where X is the number of encounters in which she buys immediately
– that she buys the product immediately in exactly 2 encounters.

– $P(X = 2) = C_2^{10}\,(0.7)^2 (1 - 0.7)^{10-2} = 0.0014$

– that she buys the product immediately in at least 2 encounters.

– $P(X \ge 2) = 1 - P(X = 0) - P(X = 1) = 1 - C_0^{10}\,(0.7)^0 (1 - 0.7)^{10-0} - C_1^{10}\,(0.7)^1 (1 - 0.7)^{10-1} = 0.9999$

– that she does not buy the product immediately in at least 2 encounters.
• Y ~ B(10, 0.3), where Y is the number of encounters in which she does not buy immediately
– $P(Y \ge 2) = 1 - P(Y = 0) - P(Y = 1) = 1 - C_0^{10}\,(0.3)^0 (1 - 0.3)^{10-0} - C_1^{10}\,(0.3)^1 (1 - 0.3)^{10-1} = 0.8507$
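These binomial probabilities can be reproduced with the standard library alone. The sketch below assumes Python 3.8 or later (for `math.comb`); `binom_pmf` is an illustrative helper name, not a library function.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n = 10

# Exactly 2 immediate purchases, X ~ B(10, 0.7)
p_exactly_2 = binom_pmf(2, n, 0.7)

# At least 2 immediate purchases: complement of "0 or 1"
p_at_least_2 = 1 - binom_pmf(0, n, 0.7) - binom_pmf(1, n, 0.7)

# At least 2 encounters without an immediate purchase, Y ~ B(10, 0.3)
p_not_buy_at_least_2 = 1 - binom_pmf(0, n, 0.3) - binom_pmf(1, n, 0.3)

print(round(p_exactly_2, 4), round(p_at_least_2, 4), round(p_not_buy_at_least_2, 4))
```

Using the complement for "at least 2" avoids summing nine separate terms from k = 2 to k = 10.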
BINOMIAL RANDOM VARIABLE

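• For a binomial random variable X ~ B(n, p), with q = 1 − p

– $\mu = np$

– $\sigma^2 = npq$

– Find the values of $\mu$ and $\sigma^2$ for the giraffe-product example, X ~ B(10, 0.7): $\mu = 10(0.7) = 7$ and $\sigma^2 = 10(0.7)(0.3) = 2.1$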
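As a cross-check, the shortcut formulas $np$ and $npq$ should agree with the mean and variance computed directly from the binomial pmf. The following sketch does this for B(10, 0.7); the variable names are illustrative.

```python
from math import comb

n, p = 10, 0.7

# Full pmf of X ~ B(n, p): pmf[k] = P(X = k) for k = 0, ..., n
pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and variance from the general discrete definitions
mean_from_pmf = sum(k * pk for k, pk in enumerate(pmf))
var_from_pmf = sum((k - mean_from_pmf) ** 2 * pk for k, pk in enumerate(pmf))

# Shortcut formulas for the binomial distribution
mean_formula = n * p
var_formula = n * p * (1 - p)

print(round(mean_from_pmf, 4), round(mean_formula, 4))
print(round(var_from_pmf, 4), round(var_formula, 4))
```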


CONTINUOUS RANDOM VARIABLE
• Takes any value in an interval or a group of intervals

• Usually obtained by measurement

• Smoothing the histogram gives a curve: the probability density function (pdf), $f(x)$, is a curve with the following probability properties

– $\int_{-\infty}^{\infty} f(x)\,dx = 1$
– Lies on or above the x-axis, i.e. $f(x) \ge 0$
CONTINUOUS RANDOM VARIABLE

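• Probability can only be obtained for an interval of values, not for a single value

– It is the area under the curve
– Found by integrating the pdf between the two end points of the interval: $P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$
– $F(x) = P(X \le x)$
– Cumulative distribution function (cdf)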
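To make "probability as area under the curve" concrete, the sketch below integrates a pdf numerically. The pdf $f(x) = 2x$ on $[0, 1]$ is a hypothetical example chosen only for illustration and does not appear in the slides; `integrate` is a simple midpoint-rule helper, not a library call.

```python
def f(x):
    """Hypothetical pdf used only for illustration: f(x) = 2x on [0, 1]."""
    return 2 * x if 0 <= x <= 1 else 0.0


def integrate(func, lower, upper, steps=10_000):
    """Midpoint-rule approximation of the area under func on [lower, upper]."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


# A valid pdf has total area 1
total_area = integrate(f, 0, 1)

# P(0.2 <= X <= 0.5) is the area under the pdf between 0.2 and 0.5
p_interval = integrate(f, 0.2, 0.5)

print(round(total_area, 4), round(p_interval, 4))
```

For this assumed pdf the exact answer is $0.5^2 - 0.2^2 = 0.21$, which the numerical area should match closely.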
UNIFORM DISTRIBUTION

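• Flat pdf over a range, usually denoted by [a, b]

• X ~ U(a, b)

• In other words, $f(x) = \begin{cases} \dfrac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$

• $\mu = \dfrac{a+b}{2}$, $\sigma^2 = \dfrac{(b-a)^2}{12}$

[Figure: the flat pdf between a and b]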
UNIFORM DISTRIBUTION

• The length of a randomly selected fish is uniformly distributed between 4 cm and 11 cm. Let X be the length of the selected fish. Answer the following:

• X ~ U(4, 11)

• $P(X \le 5) = \dfrac{1}{11-4}(5 - 4) = 0.1429$

• $P(X > 8) = \dfrac{1}{11-4}(11 - 8) = 0.4286$

• $E(X) = \dfrac{4+11}{2} = 7.5$ and $\mathrm{Var}(X) = \dfrac{(11-4)^2}{12} = 4.0833$
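Because the pdf is flat, every uniform probability is just a rectangle area, which makes the cdf easy to code. The sketch below reproduces the fish example; `uniform_cdf` is an illustrative helper written for this purpose.

```python
def uniform_cdf(x, a, b):
    """F(x) = P(X <= x) for X ~ U(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    # Rectangle of width (x - a) and height 1 / (b - a)
    return (x - a) / (b - a)


a, b = 4, 11  # fish length in cm, X ~ U(4, 11)

p_le_5 = uniform_cdf(5, a, b)      # P(X <= 5)
p_gt_8 = 1 - uniform_cdf(8, a, b)  # P(X > 8) = 1 - F(8)

mean = (a + b) / 2
variance = (b - a) ** 2 / 12

print(round(p_le_5, 4), round(p_gt_8, 4), mean, round(variance, 4))
```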
