
CL2603: Engineering Computing and Statistics

Dr B M Parker

Brunel University London

Term 2 2021/2022

Dr B M Parker (Brunel) CL2603 1


Table of Contents

Logistics
Assessment
Resources



Main Aims
To develop knowledge and skills in applied mathematics and statistical
concepts useful for solving engineering problems. This includes the ability
to develop simple mathematical models that represent experimental data
sets, estimate relevant model parameters and assess model performance, as
well as solving differential equations describing chemical engineering
phenomena.

This course will be split into two sections:


Statistics : Dr Ben Parker
Numerical Methods : Dr Mike Warby



Learning Outcomes

This Modular Block provides opportunities for students to demonstrate
knowledge and understanding (K), cognitive (thinking) skills (C), and other
skills and attributes (S) in the following areas:
Demonstrate knowledge and critical understanding of the analysis,
manipulation and interpretation of data using statistical techniques
Develop simple mathematical models to represent experimental data
sets and estimate relevant model parameters
Identify suitable regression models and assess their performance in
explaining and predicting experimental data
Use software to perform statistical analysis



Recommended Reading

Recommended textbooks for the course are available via the Talis reading
list linked from Blackboard.



What is covered/ overview plan

Part 1 (Week 18) Review of Basic Statistical Concepts, Introduction


to Statistical Computing in R, Sampling
Part 2 (Week 20) Estimation and Confidence Intervals
Part 3 (Week 22) Hypothesis Testing
Part 4 (Week 24) Review of Linear Regression and model fitting
Part 5 (Week 26) Extending Linear Regression (Multiple Regression
and model building)
Part 6 (Week 28) Non-linear Regression



Logistics
There is some uncertainty at the time of writing over whether we will be
allowed full lectures on campus. I hope to move to a more traditional
lecture format if permitted. In the meantime:
The course will have online asynchronous (i.e. recorded) lectures.
There is a slot in your timetable from 10-12 on Fridays, which you
might want to use to review the lecture material, but lectures and
slides will be made available earlier each week so that you can study
when you want.
I may occasionally ask you to do some extra reading or look at some
examples to supplement this lecture material.
On Fridays from 2pm to 4pm we have a lab, where we will be working
through examples and using the R statistical programming language.
I will set some take home exercises each week to be done before the
next session (two weeks later).
It is essential you review the lectures before the labs, and attend the
lab. I will not be recording labs.
It is best you attend the lab in person. If you cannot attend in
How this course will be taught and assessed

Each week I will provide some questions to allow you to practice the
material in the course; solutions will be given in lectures where there is
time and/or on the course blackboard.
It is important you devote some time to work through these examples; the
only way to improve at mathematics is to practice it. However, this work
is not assessed, although I am happy to talk about individual answers in
office hours.



Assessment

The assessment of this course is as follows:


50% of the course will be assessed in a lab based assignment on
statistical analysis.
50% of the course will be assessed in a written exam.



Resources

All documents for the course will be posted on blackboard,


http://blackboard.brunel.ac.uk/
This includes these slides, exercises, solutions, assignments, computer
lab exercises, and everything else that may be relevant.
This course assumes some knowledge of mathematics and statistics
from your previous courses; you may wish to review the notes for
these courses.
I will suggest books appropriate for each chapter as we proceed. A
link to these books can be found on blackboard.
I will have office hours every week (TBD)
You should also e-mail me (Ben.Parker@brunel.ac.uk) if you need
help, or to arrange a meeting outside of an office hour.
Any corrections, questions, or feedback are also very welcome!



Part I

Statistical Concepts, Computing, and


Sampling



Learning Outcomes

By the end of the first week, you will be able to:


interpret a written description of an experiment as appropriate
mathematical notation, and vice versa.
recognise situations where random variables can be represented by
known distributions (discrete and continuous Uniform, Binomial,
Geometric, Poisson, Exponential and Normal) and interpret
parameters of these distributions correctly.
calculate expectations and variance of random variables.
use computational software to find probabilities of events.



Motivation

Let’s suppose we have some hypothesis or belief about a chemical
process. We summarise this in some model.
We gather some data. We can use this data to
test whether the model is true
find out something about the parameters of the model
make some prediction using the model

From Santiago, Celine B., Jing-Yao Guo, and Matthew S. Sigman. "Predictive and mechanistic
multivariate linear regression models for reaction development." Chemical Science 9.9 (2018):
2398-2412.
Further motivation

Our data may come from physical experiments which can often be
expensive, difficult, or dangerous...
... so we will often use numerical experiments or simulation (which Dr
Shaw will talk about next week)
However we generate the data it’s very
unlikely that we can see every possible
chemical reaction that may occur, so we
assume that we see some randomly
chosen sample of all the data we could
have seen. We form our view about the
world based on this data, and in this
section of the course we’ll quantify how
much we can say about the world based
on this data.



Terminology

We call the set of all possible experiment results we could have seen
the Sample Space
We call the set of results we did actually see the data.
Example: An accelerator mass spectrometry detector measures the number
of Carbon-14 ions in five sample ice cores chosen at random locations in
Antarctica.
The sample space is Ω = {0, 1, 2, 3, ...}
Our data might be 1, 2, 12, 0, 5
Based on our data, is there anything we can say about the amount of
Carbon-14 in Antarctica?



Descriptive Statistics
It is of course interesting to provide some descriptive statistics from a
sample. In this course, I’ll demonstrate how to do these things in R which
we’ll meet in the labs.
Let’s look at some real data; the file concrete.dat contains the
compression strength (N mm⁻²) of 180 cubes of concrete made by our
chemical process.
The first thing we might do with any data set is examine it; we can do this
in R as follows:
> concrete <- read.table("concrete.dat")
This gives us an object with two columns, the experiment number and the
strength reading. To put this in a form we can work with, we do
strength<-concrete$V1
to get the data column into the strength object. We can display the data
by typing strength.
To get a preliminary feel about the data, we get a histogram by typing
hist(strength).
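The workflow above can be sketched as follows; since concrete.dat itself lives on Blackboard, the sketch uses simulated readings as a stand-in:

```r
# Simulated stand-in for the strength column of concrete.dat; the real
# file is on Blackboard, so the numbers here are illustrative only.
set.seed(1)
strength <- rnorm(180, mean = 61, sd = 4)

mean(strength)     # sample mean
median(strength)   # sample median
summary(strength)  # minimum, quartiles, median, mean, maximum
hist(strength)     # histogram of the readings
```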
We notice from this histogram that the graph is largely symmetrical,
although there are a few unusual low values which we might question; with
these values, the graph is very slightly skewed to the left.



We may also wish to summarise the data numerically; let’s call our data
y1 , y2 , y3 , . . . , yn . Here we have y1 = 57.4, y2 = 59.6, . . . , y180 = 64.7
We can find the mean as

ȳ = Σ_{i=1}^{n} y_i / n = (57.4 + 59.6 + ... + 64.7)/180 = 61.10.

In R, we can do this by mean(strength).



The median is the middle observation if we arrange the data in ascending
order; let y(1) be the smallest value, y(2) the second smallest, and suppose
we have 2m + 1 values. Then y(m+1) is the median value.
This is sometimes a more typical value, and is used in different contexts
from the mean; one technical note is that if we have two middle values,
i.e. an even number of data, then we take the median as the average of the
middle two values.
For the concrete, we can just use median(strength) to get 61.25; in this
example, the median is very close to the mean as the data are very
symmetrical.
The mode is sometimes used as the most frequently occurring value in
discrete data; for continuous data we might use the modal range; from the
histogram we can see this is 60-62 for the concrete example.



Other useful ways to describe the data are:
Maximum: The highest value (y(n)).
Minimum: The lowest value (y(1)).
Range: The maximum minus the minimum
The Upper Quartile (UQ): The value which 75% of the data lies
below.
The Lower Quartile (LQ): The value which 75% of the data lies
above.
The interquartile range: UQ-LQ.
We can also use summary(strength) to get a selection of summary
statistics for the data.
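These summaries can all be computed directly in R; a small sketch using a made-up sample (the values here are illustrative, not the concrete data):

```r
# A small illustrative sample
y <- c(2, 4, 4, 5, 7, 9, 10, 12, 15)

max(y) - min(y)    # range
quantile(y, 0.75)  # upper quartile (UQ)
quantile(y, 0.25)  # lower quartile (LQ)
IQR(y)             # interquartile range, UQ - LQ
```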



We can see all of these graphically by using a boxplot; in R for the
concrete data we use boxplot(strength).

the top point is the maximum, the bottom the minimum; the upper and
lower quartiles are the edges of the rectangle, and the median the line in
the middle of the rectangle. Note that R decides that the three lower
points are sufficiently different from the rest of the data to be outliers, and
draws these as circles.



Spread of the data

Statistics is all about quantifying uncertainty; we introduced the range and


the interquartile range, but more often we will look at the variance (or
equivalently the standard deviation) of a sample, which is defined as

Definition (Sample variance)


s² = Σ_{i=1}^{n} (y_i − ȳ)² / (n − 1) = (Σ_{i=1}^{n} y_i² − n ȳ²) / (n − 1).

s (the square root of the variance) is then the sample standard deviation.
Note this is defined a little differently from the standard deviation of a
population or distribution that we defined in Chapter 1.
Variance is just a way of quantifying the spread of the data. Smaller
variances mean we know more about where the data lies; higher variances
mean we know less.
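The two forms of the sample variance formula give the same answer, and both match R's built-in var(); a quick check on a few made-up readings:

```r
# Five illustrative strength readings
y <- c(57.4, 59.6, 60.1, 62.3, 64.7)
n <- length(y)

v1 <- sum((y - mean(y))^2) / (n - 1)        # definition form
v2 <- (sum(y^2) - n * mean(y)^2) / (n - 1)  # computational form
var(y)  # agrees with both v1 and v2
sd(y)   # sample standard deviation, the square root of the variance
```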



Contents

Spread of the data

1 Random Variables
Selected Discrete distributions
Expectation and variance



Random Variables

Let’s suppose for each element in our sample space we assign a number;
usually this will be some meaningful statistic based on that outcome.
For example, let us suppose we toss a fair coin twice and record the
output. Our sample space Ω = {HH, HT , TH, TT }. We could invent a
random variable X, which might be the number of heads in two tosses so

X (HH) = 2

X (HT ) = X (TH) = 1
X (TT ) = 0
If our variable takes only discrete values (e.g. 0, 1, 2, ...) then it is a
discrete random variable; otherwise it is a continuous random variable.
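We can simulate this random variable in R to see that the values 0, 1, 2 occur with roughly the frequencies worked out on the next slide:

```r
# Simulate two fair coin tosses many times and count the heads each time
set.seed(2)
tosses <- matrix(sample(c("H", "T"), 2 * 10000, replace = TRUE), ncol = 2)
x <- rowSums(tosses == "H")  # X = number of heads in two tosses

table(x) / 10000  # relative frequencies, close to 1/4, 1/2, 1/4
```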



Definition (Probability distribution function)
Let X be a discrete random variable. We let the probability distribution
function be
f (x) = P(X = x).

Example (Coins again)


We toss 2 coins and record the number of heads as before. We have

f(0) = P(X = 0) = 1/4
f(1) = P(X = 1) = 1/2
f(2) = P(X = 2) = 1/4

We will often write these as a table:

x        | 0    | 1   | 2
P(X = x) | 0.25 | 0.5 | 0.25



Definition (Cumulative distribution function)
Let X be a discrete random variable. We let the cumulative distribution
function be

F(x) = P(X ≤ x) = Σ_{u ≤ x} f(u).

Coins again
We can write out the cumulative distribution for the coin example as
follows
F(1) = P(X ≤ 1) = P(X = 0) + P(X = 1) = 1/4 + 1/2 = 3/4
F(1.5) = P(X ≤ 1.5) = F(1) here
F(−17) = P(X ≤ −17) = 0

Note we use lower case f to refer to the probability distribution function,


and capitalized F to refer to the cumulative distribution function.
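In R, the cumulative distribution function of a discrete random variable is just the cumulative sum of the probability distribution function; for the coin example:

```r
# f(x) for x = 0, 1, 2 heads, as in the table above
f <- c(0.25, 0.5, 0.25)

# F(x) = P(X <= x) is the running total of f
F <- cumsum(f)
F  # 0.25 0.75 1.00, so F(1) = 3/4 as calculated above
```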
Common distributions

We will often make use of standard distributions whose properties we know


to model our processes.
It is often the case that if we approximate our process to a known
distribution, we can make our calculations a lot simpler (at the cost of
some accuracy).
For example, physical laws might tell us that radiation is a random
process, so we might model the number of radioactive particles as a
Poisson distribution which models rare events. Although it is unlikely the
process is exactly Poisson, we gain more than we lose.
For example, we might know the process is Poisson, and be required to
measure the mean radiation rate.
We’ll look at some common distributions in the next few slides.



Definition (The geometric distribution)
A number of trials take place sequentially, and each trial is independently
recorded as a success, with probability p, or a failure.
The geometric distribution specifies the number of failures, X, before the
first success.
It has a parameter p which represents the probability of success for each
trial.
The random variable, X, takes values 0, 1, 2, ... It has a PDF

P(X = x) = f(x) = (1 − p)^x p

We can show the CDF is P(X ≤ x) = F(x) = 1 − (1 − p)^(x+1).
We will sometimes write X ∼ Geo(p) to show that the random variable X
has a geometric distribution with probability of success parameter p.
N.B. Different authors might define the geometric distribution as the number of
trials up to and including the first success. Be careful: this is equivalent, but
slightly different mathematically to what I use here.



Example
An engineer has designed a storm water sewer system so that the yearly
maximum discharge will cause flooding on average once every 10 years.
This means that the probability each year that there will be a discharge
which causes flooding is 0.1. If it can be assumed that the maximum
discharges are independent from year to year, what is the probability that
there will be at least one flood in the next five years?

We can model this as a geometric distribution, with parameter p = 0.1.


We measure the number of years X before we have a "successful" flood.
f(0) = (1 − p)⁰ p = 0.1
f(1) = (1 − p)¹ p = 0.9 × 0.1 = 0.09
f(2) = (1 − p)² p = 0.9 × 0.9 × 0.1 = 0.081, etc.



To answer the question, the probability that there will be at least one
flood in the next five years is equal to the probability that we have at most
4 ”failures” before the first flood, so can be written as

P(X ≤ 4) = F(4) = 1 − (1 − p)⁵ = 1 − 0.9⁵ = 1 − 0.59 = 0.41


This calculation is easy if we realise that our model is a geometric
distribution and use the known properties of that distribution; we do not
have to work the CDF out from scratch every time.
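R's pgeom uses the same failures-before-first-success convention as these slides, so it can check the flood calculation directly:

```r
p <- 0.1        # probability of a flood in any one year

pgeom(4, p)     # P(X <= 4): at most 4 flood-free years before the first flood
1 - (1 - p)^5   # the same value from the CDF formula, 1 - 0.9^5
```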



Example (Poisson distribution)
Poisson distribution: f(x) = λ^x e^(−λ) / x!. Used to model rare events
which occur independently, e.g.:
The number of hurricanes per year in Texas.
The number of ions of a particular type in a small chemical sample.
The number of days per year that the atmosphere in a city is classed
as noxious.



For example, we might know that the number of days X when the water
quality at a beach reaches toxic levels is Poisson distributed with
parameter λ = 5.
We can calculate the probability that there are X days per year with toxic
water quality as follows:

P(X = 0) = λ⁰ e^(−λ) / 0! = (1 × e^(−5)) / 1 = e^(−5) = 0.007

P(X = 1) = λ¹ e^(−λ) / 1! = (5 × e^(−5)) / 1 = 5e^(−5) = 0.033

...

P(X = 4) = λ⁴ e^(−λ) / 4! = (625 × e^(−5)) / 24 = 0.175
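These probabilities come straight from R's dpois:

```r
lambda <- 5  # mean number of toxic-water days per year

dpois(0, lambda)  # e^-5, about 0.007
dpois(1, lambda)  # 5 e^-5, about 0.033
dpois(4, lambda)  # about 0.175
```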



Figure: Probability function f(x) of the Poisson distribution with λ = 5,
for x = 0, 1, ..., 20.


Example (Binomial distribution)
Binomial distribution: f(x) = C(n, x) p^x (1 − p)^(n−x).
(Remember that C(n, x) = n! / (x!(n − x)!) is the number of ways of
choosing x things without replacement from a set of n things.)
Used to model the probability of x successes from a total number of n
trials, each occurring independently with probability p. For example:
The number of times a biased coin with probability of heads p will
come up heads if tossed n times.
The number of times a year it will rain if it rains independently each
day with probability p and n = 365.
The number of ions of samples that contain a Carbon-14 ion if we
have n samples, each containing the ion with probability p
independently.

For example, we have 20 high specification parts for a centrifuge. What is


the distribution of the number of faulty parts if the probability of a faulty
part is 0.3,0.5, or 0.7 respectively?
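The three distributions asked for can be tabulated with R's dbinom, which evaluates the binomial probability function above:

```r
n <- 20  # number of centrifuge parts
for (p in c(0.3, 0.5, 0.7)) {
  # probability of 0, 1, ..., 20 faulty parts when each part is faulty
  # independently with probability p
  cat("p =", p, "\n")
  print(round(dbinom(0:n, n, p), 3))
}
```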
Figure: Probability functions f(x) of the Binomial distribution with n = 20
and p = 0.3 (top left), p = 0.5 (top right), and p = 0.7 (bottom).


Some examples of density functions

Distribution    Defined on       pdf f(x)                   Notes
Uniform(a,b)    a, a+1, ..., b   1/(b − a + 1)              X is equally likely to be a, a+1, ..., b
Binomial(n,p)   0, 1, ..., n     C(n,x) p^x (1 − p)^(n−x)   X is the number of successes from n trials, each with probability p
Geometric(p)    0, 1, ...        (1 − p)^x p                Number of failures before first success
Poisson(λ)      0, 1, ...        λ^x e^(−λ) / x!            “Rare” events


Definition (Cumulative Distribution Function/ Probability Density
Function)
For continuous random variables, we can define the cumulative distribution
function

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du.

The probability density function (pdf) can then be found as the derivative
of the cumulative distribution function,

f(x) = (d/dx) F(x).

For the pdf to be valid, we must have that
f(x) ≥ 0 for all x.
∫_{−∞}^{∞} f(u) du = 1.
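Both validity conditions can be checked numerically in R; a sketch for one example pdf, the exponential with rate 1 (introduced properly later in this part):

```r
# Example pdf: f(x) = e^-x on x >= 0 (the Exponential(1) density)
f <- function(x) exp(-x)

integrate(f, 0, Inf)$value         # total area under the pdf; should be 1
all(f(seq(0, 10, by = 0.1)) >= 0)  # f(x) >= 0 on a grid of values
```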



Differences between CDF and PDF

PDF:
Value greater than or equal to
zero.
Must integrate to 1.
To go from PDF→ CDF we
integrate under the curve.
CDF:
Function ranges from zero to 1
(F (−∞) = 0, F (∞) = 1)
Never decreases.
To go from CDF → PDF we
differentiate the function.



Examples of continuous distributions

Example (Example 1: The Normal Distribution)


The Normal distribution (commonly called the Gaussian) with mean
parameter µ and variance parameter σ² has a probability density function
given by

f(x) = (1/√(2πσ²)) e^(−(x−µ)²/(2σ²))
It is used to model many things in the sciences, and we shall see its use
later in the course for statistical inference.
Errors in experiments or measurements
Processes that move at random, such as the (logarithm of) changes
in prices of stocks, or changes in the high tide level from day to day.
“White noise” in electrical engineering.



We can see how the PDF and CDF vary with the parameters for the
normal distribution:

Figure: CDF (left) and PDF (right) for the normal distribution for varying
values of the parameters.
The Standard Normal Distribution

The normal distribution with mean 0 and standard deviation 1, N(0, 1), is
called the standard normal distribution.
For the standard normal distribution, tables are available in all published
books of statistical tables (For example, table 4 of ‘New Cambridge
Statistical Tables’, 2nd Edition, by D. V. Lindley and W. F. Scott.) giving
the probability of the distribution in selected regions.
Most tables give areas under the curve to the left of a specified value,
i.e. the probability of observing a standard normal value less than or equal
to a specified value, P(Z ≤ z).
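In R, pnorm plays the role of these tables, returning P(Z ≤ z) directly:

```r
pnorm(1.74)   # P(Z <= 1.74), approximately 0.9591 as in the tables
pnorm(-1.74)  # P(Z <= -1.74) = 1 - P(Z <= 1.74), approximately 0.0409
```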



Usually, tables only give P(Z ≤ z) for positive values of z. For negative
values, we use the symmetry of the distribution to calculate the required
probability.
Figure: symmetry of the standard normal density about zero.

P(Z ≤ −z) = 1 − P(Z ≤ z)


So therefore the probability of an observation of a standard normal
population being less than −1.74 is 0.0409.



We can now calculate probabilities for any region.
Figure: the probability of a central region as the difference of two
left-tail areas.

P(z1 ≤ Z ≤ z2) = P(Z ≤ z2) − P(Z ≤ z1)


So therefore the probability of an observation of a standard normal
population being between −0.04 and 1.74 is 0.9591-(1-0.5160)=0.4751.



Standardising a Normally Distributed Variable

The normal distribution has a particularly convenient property.


Consider a variable whose probability distribution has mean µ and
standard deviation σ. Suppose that we subtract µ from this variable and
then divide by σ, to obtain a transformed variable.
The transformed variable has mean 0 and standard deviation 1.
Furthermore, if the distribution of the original variable is normal, the
transformed variable has a standard normal distribution.
The operation of subtracting the mean (µ) of the distribution and dividing
by the standard deviation (σ) is called standardising the variable, and we
write
Z = (X − µ)/σ.
By standardising, we can calculate probabilities for any normal distribution
using tables of the standard normal distribution.



Example (SO2 )
Suppose that the atmospheric SO2 (sulphur dioxide) concentration at a
particular location is normally distributed with mean 25.8 µgm−3 and
standard deviation 5.5 µgm−3 . What is the probability of a SO2
concentration between 20 and 30 µgm−3 ?

If we denote the SO2 concentration by X then Z = (X − 25.8)/5.5 is a


variable with a standard normal distribution.
We require P(20 ≤ X ≤ 30).
When x = 20, z = −1.05. When x = 30, z = 0.76
P(20 ≤ X ≤ 30) = P(−1.05 ≤ Z ≤ 0.76) = 0.7764 − (1 − 0.8531) =
0.6295.
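R will do this calculation without standardising by hand, since pnorm accepts mean and sd arguments; the small difference from 0.6295 arises because the tables round z to two decimal places:

```r
# P(20 <= X <= 30) for X ~ Normal(mean 25.8, sd 5.5)
pnorm(30, mean = 25.8, sd = 5.5) - pnorm(20, mean = 25.8, sd = 5.5)
```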



Figure: the required area under the Normal(25.8, 5.5²) density between 20
and 30 equals the corresponding area under the standard normal density.


The Exponential distribution

The exponential distribution has probability density function
f(x) = λe^(−λx), for x ≥ 0. It takes one parameter, λ. It is commonly
used to model the lifetime of components, or times between failures.
Example
A turbine blade has a lifetime exponentially distributed with λ = 0.5.
What is the probability the turbine lasts more than 5 years?

We know the pdf of the distribution is f(x) = λe^(−λx) = 0.5e^(−0.5x).
The probability that the turbine lasts less than 5 years is
F(5) = ∫₀⁵ f(x) dx.
Thus the probability that the turbine lasts more than 5 years is 1 − F(5).
This can be calculated directly (exercise) and found to be 0.082.
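The direct calculation can be checked with R's exponential distribution functions:

```r
lambda <- 0.5

# P(lifetime > 5) = 1 - F(5) = e^(-0.5 * 5)
1 - pexp(5, rate = lambda)  # approximately 0.082
```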



Figure: CDF (left) and PDF (right) for the exponential distribution for
varying values of the parameter λ.



Selected continuous distributions

Distribution     Defined on    pdf f(x)                        Notes
Uniform(a,b)     [a, b]        1/(b − a)
Exponential(λ)   x ≥ 0         λe^(−λx)                        Time between failures
Normal(µ, σ²)    −∞ < x < ∞    (1/√(2πσ²)) e^(−(x−µ)²/(2σ²))   Commonly called the Gaussian in Engineering


Expectation of a random variable

Definition
The expectation of a discrete random variable is

E(X) = Σ_{x∈Ω} x f(x)

and of a continuous random variable is

E(X) = ∫_{−∞}^{∞} x f(x) dx.
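The continuous definition can be checked numerically with R's integrate; a sketch for the Exponential(λ = 2) distribution, whose expectation (from the table at the end of this part) is 1/λ = 0.5:

```r
lambda <- 2
f <- function(x) lambda * exp(-lambda * x)  # Exponential(2) pdf, x >= 0

# E(X) = integral of x f(x) dx over the support
integrate(function(x) x * f(x), 0, Inf)$value  # approximately 0.5 = 1/lambda
```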



Example (Fair Die)
What is the expectation of the number shown, X, when rolling a fair die?

The number displayed on the die, X, has a discrete uniform distribution
between 1 and 6 (i.e. P(X = x) = 1/6 for x = 1, 2, 3, 4, 5, or 6).
Thus the expectation is

E(X) = Σ_{x∈Ω} x f(x) = Σ_{x=1}^{6} x/6
     = (1 + 2 + 3 + 4 + 5 + 6)/6 = 21/6 = 3.5
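The same answer falls out of a direct sum in R, and simulation agrees:

```r
x <- 1:6
sum(x * (1/6))  # exact expectation, 3.5

# Simulating many rolls gives a sample mean close to 3.5
set.seed(3)
mean(sample(x, 1e5, replace = TRUE))
```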



Example (Lifetime of a turbine blade)
The lifetime of a turbine blade in days is distributed according to the
following PDF:

f(x) = 20000/x³,   x > 100
f(x) = 0,          otherwise

What is its expected lifetime?



Expectations of functions of variables

Note that if X is a random variable, and we take some function g(X),
then Y = g(X) is a random variable also, and

E[g(X)] = Σ_x g(x) f(x), or
E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx

for discrete and continuous random variables respectively.
It follows that, for random variables X and Y, and for constant c that

E (cX ) = cE (X )

E (X + Y ) = E (X ) + E (Y )
Moreover, if X and Y are independent, then E (XY ) = E (X )E (Y ). Note
that the converse is not true.



Example (Resistors)
A random current I flows through a resistor with R = 50Ω. The
probability density function for the current is given as

f(x) = 2kx,         0 ≤ x < 0.5
f(x) = 2k(1 − x),   0.5 ≤ x ≤ 1
f(x) = 0,           otherwise

What is the expected value of the voltage across the resistor?
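A numerical sketch of the approach in R (assuming, as the normalisation condition forces, that k = 2; Ohm's law V = IR then turns E(I) into E(V)):

```r
k <- 2   # forced by requiring the pdf to integrate to 1
R <- 50  # resistance in ohms

f <- function(x) ifelse(x < 0.5, 2 * k * x, 2 * k * (1 - x))  # pdf on [0, 1]

integrate(f, 0, 1)$value                           # check: integrates to 1
EI <- integrate(function(x) x * f(x), 0, 1)$value  # E(I), expected current
R * EI                                             # E(V) = R E(I) by linearity
```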



Variance
The variance of a random variable is a way of quantifying how much a
distribution varies about its expected value.
Definition (Variance and standard deviation)
The variance of a random variable X is defined as

Var(X ) = E [(X − µ)2 ].

The standard deviation of X is the (positive) square root of the variance.

For a discrete random variable X, we can therefore write the variance of X
as

σ²_X = E[(X − µ)²] = Σ_x (x − µ)² f(x)

and for a continuous random variable as

σ²_X = E[(X − µ)²] = ∫_{−∞}^{∞} (x − µ)² f(x) dx.



As a consequence of our definitions, we can find the variances of
combinations of random variables:

Var(X + Y) = Var(X) + Var(Y)
Var(X − Y) = Var(X) + Var(Y)
Var(cX) = c² Var(X)

Here X and Y are independent random variables, and c is a constant.

Example (Hint)
It's often easier to use the equivalent definition of the variance,

Var(X) = E(X²) − [E(X)]²
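The hint is easy to verify on the two-coin example from earlier (X = number of heads in two tosses):

```r
x <- 0:2
fx <- c(0.25, 0.5, 0.25)  # P(X = x) from the coin-tossing table

EX  <- sum(x * fx)        # E(X) = 1
EX2 <- sum(x^2 * fx)      # E(X^2) = 1.5
EX2 - EX^2                # Var(X) = 0.5
```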



Example (Fair die again)
What is the variance of the number displayed on a fair die?

Example (Lifetime of a turbine blade)


The lifetime of a turbine blade in days is distributed according to the
following PDF:

f(x) = 20000/x³,   x > 100
f(x) = 0,          otherwise
What is the variance of its lifetime?



We can calculate the mean and variance of some selected distributions as
follows:

Table: Selected Discrete distributions


Distribution    Mean         Variance
Uniform(a,b)    (a + b)/2    ((b − a + 1)² − 1)/12
Binomial(n,p)   np           np(1 − p)
Geometric(p)    (1 − p)/p    (1 − p)/p²
Poisson(λ)      λ            λ


Table: Selected continuous distributions
Distribution     Mean        Variance
Uniform(a,b)     (a + b)/2   (b − a)²/12
Exponential(λ)   1/λ         1/λ²
Normal(µ, σ²)    µ           σ²
