You are on page 1of 16

Continuous Distribution

MAT 3103: Computational statistics and Probability


Chapter 6: Continuous Distribution

Continuous probability distribution:


It is the probability distribution of a continuous random variable. A probability distribution in
which the random variable X can take on any value (is continuous). Because there are infinite
values that X could assume, the probability of X taking on any one specific value is zero.
Most of the data science or machine learning skills are based on certain assumptions
about the probability distributions of the data. Continuous distributions allow a data scientist
to recognize the patterns in continuous random variables in many ways.
Under continuous probability distribution, we will learn,
(a) Normal distribution,
(b) Exponential distribution and
(c) Rayleigh distribution.

Normal distribution:
Everything in life is a trade-off. Some of us are born with high aptitudes for academic learning,
some are born with great physical skills, some are athletic, some are artistic and so on. In terms
of skills and talents, humans are wildly diversified. What we end up accomplishing in life
ultimately depends on our practice and effort, but we are all born with different aptitudes and
potentials.

1
Continuous Distribution

Above is a normal curve which is quite simple. Let us consider the population of students in
AIUB. The horizontal axis represents how good they are at Math. Further to the right means
they’re really good, further to the left means they’re really bad. Now, notice that it gets really
thin at the far ends of the curve which means there are a few students who are really, really
good at Math, and a few students who are really, really bad. The majority of the students fall
into the mediocre middle. We can apply a “normal curve” in this way to tons of things in a
population in real life like height, weight, emotional maturity and so on.

Probability distribution of normal distribution:


Mathematically, the distribution of a normal variable X with mean 𝜇 and variance 𝜎 2 , denoted
by, X~N (𝜇, 𝜎 2 ), is defined as:
1 𝑥− 𝜇 2
1
f(x) = 𝜎√2𝜋 𝑒 − 2( )
𝜎 ; −∞ ≤ 𝑥 ≤ ∞ .

Characteristics of normal distribution:


a. The normal curve is bell-shaped as shown below-

b. Mean, median and mode of normal distribution coincide.


c. The normal curve is symmetric and mesokurtic.
𝑍−𝜇
d. If X~N( 𝜇, 𝜎 2 ), then Z = ~ N( 0, 1). This Z is called standard normal variable.
𝜎

e. If 95% values of any variable fall in the limit 𝑥̅ ± 2𝜎 (i.e. 𝑥̅ − 2𝜎 to 𝑥̅ +2𝜎), then this
variable is called normal variable, and follows normal distribution.

2
Continuous Distribution

Example 6.1: The life length (in days) of electric bulb follows N (400, 5000). Find the
probability that the life length of a randomly selected bulb is (i) less than 450 days, (ii) between
300 to 500 days, and (iii) more than 350 days. [N.B. Z – table is to be used to calculate the
probabilities]
Solution: Let, x be the life length (in days) of electric bulb.
Since, X ~ N (400, 5000); we have, 𝜇 = 400, σ2 = 5000 ∴ σ = √5000 = 70.71.

𝑥−𝜇 450−400
i) P (X < 450) = P ( < ) = P (Z < 0.71) = 0.7611.
𝜎 70.71
300−400 𝑥−𝜇 500−400
ii) P (300 < X < 500) = P ( < < ) = P (−1.41 < Z < 1.41)
70.71 𝜎 70.71

= P (Z < 1.41) − P (Z < −1.41) = 0.9207 – 0.0793 = 0.8414.


𝑥−𝜇 350−400
iii) P (X > 350) = P ( > ) = P (Z > −0.71) = 1− P (Z ≤ −0.71)
𝜎 70.71

= 1− 0.2389 = 0.7611.

Example 6.2: If the birth weight in a population are normally distributed with mean of 109 oz
and a standard deviation of 13 oz. What is the chance of obtaining a birth weight of i) 141oz
or havier ii) less than 100 oz?

Solution: Let, x be the birth weight (oz).


Since, we have, 𝜇 = 109, σ = 13.
𝑥−𝜇 141−109
i) P (X > 141) = P ( > ) = P (Z > 2.46) = 1− P (Z ≤ 2.46) = 1− 0.9931 = 0.0069.
𝜎 13
𝑥−𝜇 100−109
ii) P (X < 100) = P ( < ) = P (Z < − .69 ) = 0.2451.
𝜎 13

MATLAB code
To compute the pdf of the Normal distribution with mean mu (μ) and standard deviation sigma
(σ).
y = normpdf(x,mu,sigma)

3
Continuous Distribution

Table of Standard Normal Probabilities for z-scores. Table entries are "less than" areas.

4
Continuous Distribution

5
Continuous Distribution

Exponential distribution:
Exponential distribution is concerned with the amount of time until some specific event occurs.
There are fewer large values and more small values. The values of an exponential
random variable occur in this way. For example, the amount of money customers spend in a
trip to the supermarket follows an exponential distribution. There are more people who spend
small amounts of money and fewer people who spend large amounts of money.
Mathematically, the distribution of an exponential variable X is defined as:
𝑥
1
f(x) = λ 𝑒 − λ ; x > 0, λ > 0.

Here, X = service time / length of time / waiting time,


𝜆 = average of service time / length of time / waiting time. Also, the variance is 𝜆2.

Other forms of exponential distribution are


a. f(x) = e−𝑥 ; x >0, 𝜆 =1
1
b. f(x) = 𝜆e−𝜆x ; x >0, 𝜆 = 𝜆

Applications in real life:


Using exponential distribution, we can answer some of the questions in real life scenario as
below:
Waiting time modeling:
I am leaving for AIUB from my house in Uttara on a day. The bus comes in every 15 minutes
on average and I just missed the bus!!! The driver was so mean!!! The moment I arrived, his
helper closed the door and he left. If the next bus doesn’t arrive in next 10 minutes, I must call
Uber or else I’ll be late. What’s the probability that it’ll take less than 10 minutes for the next
bus to arrive?

6
Continuous Distribution

Reliability modeling:
Since we can model the successful event (the arrival of the bus), why not the failure modeling
— the amount of time a product last? The number of hours that Amazon Web Services (AWS)
hardware can run before it needs a restart is exponentially distributed with an average of 8000
hours. You don’t have a backup server and you need an uninterrupted 10000-hour run. What
is the probability that you will be able to complete the run without having to restart the server?
What is the probability that the server doesn’t require a restart between 12 months and 18
months?

Formulas to calculate probabilities:


𝑥 𝑥 𝑥 𝑥
∝ 1 ∝ 1
P ( X > x) = ∫𝑥 𝑓(𝑥) dx = 𝜆 ∫𝑥 𝑒 −𝜆 dx = [ 𝜆 𝜆 𝑒 − 𝜆 ]∝𝑥 = −(0−𝑒 − 𝜆 ) = 𝑒 − 𝜆 .
𝑥
P (X < x) = 1− 𝑒 − 𝜆 .
𝑥1 𝑥2
P (𝑥1 < X < 𝑥2 ) = 𝑒 − 𝜆 − 𝑒− 𝜆 .

Properties:
• If the number of events per unit time follows a Poisson distribution, then the amount
of time between events follows the exponential distribution. If you get 3 customers
per hour, it means you get one customer every 1/3 hour.
• Exponential distribution is also called the distribution of ‘Lack of Memory’, since
before the occurrence of the event the experimenter may need to wait for some time
which is not counted for the occurrence of the event (queue time).

MATLAB code
To compute the pdf of the Exponential distribution with mean Lambda (λ).
y = exppdf(x,lambda).

7
Continuous Distribution

Example 6.3: The average time needed to open a computer is 2 minutes. Find the probability
that a computer will be opened (i) after 3 minutes, (ii) before 2 minutes, (iii) within 2 to 3
minutes. If a man is in queue for half - an – hour, what is the probability that he will be able to
open the computer- (iv) after 35 minutes, (v) before 35 minutes, and (vi) within 32 to 35
minutes.

Solution: Let, X be the time needed to open a computer. Given, 𝜆 = 2.


𝑥 3
(i) P (X > 3) = 𝑒 − 𝜆 = 𝑒 − 2 = 0.2231
𝑥 2
(ii) P (X < 2) = 1− 𝑒 − 𝜆 = 1− 𝑒 − 2 = 0.63212
2 3
(iii) P (2 < X < 3) = 𝑒 − 2 − 𝑒 − 2 = 0.14478
(iv) As he is in queue for 30 minutes, he will open the computer after (35−30) = 5
𝑥 5
minutes. P (X > 5) = 𝑒 − 𝜆 = 𝑒 − 2 = 0.08208
(v) He will open the computer before (35−30) = 5 minutes.
𝑥 5
P (X < 5) = 1− 𝑒 − 𝜆 = 1− 𝑒 − 2 = 1− 0.08208 = 0.91792
(vi) He will open the computer within (32−30) = 2 minutes to (35−30) = 5 minutes.
2 5
P (2 < X < 5) = 𝑒 − 2 − 𝑒 − 2 = 0.2858

8
Continuous Distribution

Rayleigh distribution:
It is a continuous probability distribution, named after the English Lord Rayleigh, used in:
• Communications theory: to fit multiple paths of dense scattered signals reaching a
server.
• Physical science: to model wind speed, wave heights and sound / light radiation.
• Engineering: to measure the lifetime of an object where the lifetime depends on
object’s age, such as resistors, transformers, and capacitors in aircraft radar sets.
• Medical imaging science: to model noise variance in magnetic resonance imaging.

Mathematically, the distribution of a Rayleigh variable X is defined as:


𝑥2
𝑥 −
f(x) = 𝜎2 𝑒 2𝜎2 ; x > 0, 𝜎 > 0.
𝑀𝑒𝑎𝑛(𝑎𝑣𝑒𝑟𝑎𝑔𝑒) 𝑥̅
Here, mode of the Rayleigh distribution, 𝜎 = = 1.253.
1.253

f(x)

Formulas to calculate probabilities:

𝑥2 𝑥2
∝ ∝ 𝑥 − −
P (X > x) = ∫𝑥 𝑓(𝑥) dx = ∫𝑥 𝑒 2𝜎2 dx = 𝑒 2𝜎2
𝜎2
𝑥2

P (X < x) = 1− 𝑒 2𝜎2

𝑥 2 𝑥 2
− 12 − 22
P (𝑥1 < X < 𝑥2 ) = 𝑒 2𝜎 − 𝑒 2𝜎

9
Continuous Distribution

Example 6.4: The average wind speed of a day is 4.5 (knots). Find the probability that in a
randomly selected day, the wind speed (i) will exceed 4 knots, (ii) will be less than 3 knots,
and (iii) will be between 2 to 5 knots.
𝑥̅ 4.5
Solution: Let, X be the wind speed. Given, 𝑥̅ = 4.5; then, 𝜎 = 1.253 = 1.253 = 3.59.
42

(i) P (X > 4) = 𝑒 2(3.59)2 = 0.5376
32

(ii) P (X < 3) = 1− 𝑒 2(3.59)2 = 0.2947
22 52
− −
(iii) P (2 < X < 5) = 𝑒 2(3.59)2 − 𝑒 2(3.59)2 = 0.4772

Example 6.5: The mode of the density of faded out signal is 0.5. Find the probability that the
density will be (i) more than 0.8, (ii) less than 0.4, and (iii) between 0.4 to 0.6.
Solution: Let, X be the density of faded out signal. Given, 𝜎 = 0.5.
0.82

(i) P (X > 0.8) = 𝑒 2(0.5)2 = 0.278
0.42

(ii) P (X < 0.4) = 1− 𝑒 2(0.5)2 = 0.2739
0.42 0.62
− −
(iii) P (0.4 < X < 0.6) = 𝑒 2(0.5)2 − 𝑒 2(0.5)2 = 0.2393

MATLAB code
To compute the Rayleigh pdf at each of the values in X using the corresponding scale
parameter, mode (B). X and B can be vectors, matrices, or multidimensional arrays that all
have the same size, which is also the size of Y.
y = raylpdf(x,B).

10
Continuous Distribution

Exercise 6
6.1. A radar unit is used to measure speeds of cars on a motorway. The speeds are normally
distributed with a mean of 90 km/hour and a standard deviation of 10 km/hr. What is the
probability that a car picked at random is travelling at more than 100 km/hour?

6.2. The length of time between charges of the battery is normally distributed with a mean of
50 hours and variance of 225 hours. Asif owns one of these computers and wants to know the
probability that the length of time will be between 50 and 70 hours.

6.3. Entry to BUET is decided by test. Scores of the test are N(500, 1000). Habib wants
admission in BUET. He knows that he must score better than at least 70% of the students.
Habib takes the test and scores 585. Will he be admitted to BUET?

11
Continuous Distribution

6.4. A large group of students took a test in Math and the final grades have a mean score of 70
and a standard deviation of 10. If we approximate the distribution of these grades by a normal
distribution, what percent of the students scored less than 80?

6.5. The average time needed to get service in a bank is 15 minutes. Find the probability that a
man will be served i) before 25 minutes, ii) between 12 to 18 minutes and iii) after 20 minutes.

6.6. The mode of the density of faded out signal is 3.59. Find the probability that the density
of a random signal will be i) between 2 to 5 and ii) more than 3.

12
Continuous Distribution

6.7. The average wind speed of a day is 0.6265 (knots). Find the probability that in a randomly
selected day, the wind speed will be i) less than 0.5 knots ii) exceed 0.8 knots and iii) between
0.4 to 0.6 knots.

6.8. In a city, it is estimated that the maximum temperature in June is normally distributed with
a mean of 23º C and a standard deviation of 5° C. Find the probability of temperature in this
month to reach a maximum of between 21° C and 27° C.

13
Continuous Distribution

6.9. The average time needed to send a signal from a server is 5 minutes. Find the probability
that a signal will be sent (i) before 310 seconds, and (ii) after 220 seconds.

6.10. The time taken to assemble a car in a plant is a random variable having N(20,4). What is
the probability that a car can be assembled at this plant in a period of time (i) less than 19.5
hours and (ii) between 20 and 22 hours?

14
Continuous Distribution

6.11. The annual salaries of employees in an organization are approximately normally


distributed with a mean of TK 50000 and a standard deviation of TK 20000. (i) What percent
of employees earn less than TK 40000? and (ii) What percent of employees earn between TK
45000 and TK 65000?

6.12. The average time needed to get service in a bank is 15 minutes. If the man waits for 30
minutes in queue, find the probability that he will be served i) within 35 to 50 minutes, ii) after
40 minutes and iii) before 1 hour.

15
Continuous Distribution

Sample MCQs
1. Let x be the random variable that represent the score is a normally distributed variable with
mean μ = 30 and standard deviation σ = 4. Find the probability that the score is less than 40.
a) 0.1487 b) 0.0062 c) 0.9938 d) None

2. The consumption of electricity per day (in MW) follows N (30, 16). Find the probability
that in a randomly selected day the consumption is greater than 21.
a) 0.0122 b) 0.9878 c) 0.1374 d) 0.0563

4. The average time needed to get a phone call is 1 hour. Find the probability that a call will
be got after 1 hour 30 minutes.
a) 0.2231 b) 0.7767 c) 0.0225 d) 0.5624

5. The average time needed to repair a mobile phone set is 2 hours. If a customer is in queue
for 1 hour, what is the probability that his set will be repaired within 1.5 hours?
a) 0.2212 b) 0.7789 c) 0.9349 d) 0.0327

6. The average time needed to get the service in a bank is 60 minutes. Find the probability
that a service will be provide between 80 minutes to 100 minutes.
a) 0.2212 b) 0.0755 c) 0.9349 d) 0.0327

7. The mode of the power of signal is 3.59. Find the probability that a random power of
signal will be between 2 to 5.
a) 0.4771 b) 0.8563 c) 0.3791 d) 0.0892

16

You might also like