You are on page 1of 37

PROBABILITY & STATISTICAL

INFERENCE LECTURE 3
MSc in Computing (Data Analytics)

Lecture Outline
A quick recap
Continuous distributions.
Question Time

A Quick Recap
Probability & Statistics
We want to make decisions
based on evidence from a
sample i.e. extrapolate
from sample evidence to a
general population
To make such decisions we
need to be able to
quantify our (un)certainty
about how good or bad
our sample information is.

Population
Representative
Sample
Sample
Statistic
Describe
Make
Inference
Some Definitions
An experiment that can result in different outcomes, even though it is
repeated in the same manner every time, is called a random
experiment.
The set of all possible outcomes of a random experiment is called
the sample space of an experiment and is denote by S

A sample space is discrete if it consists of a finite or countable
infinite set if outcomes.

A sample space is continuous if it contains an interval or real
numbers.

An event is a subset of the sample space of a random experiment.


Some Definitions
A sample space is discrete if it consists of a finite or
countable infinite set if outcomes.

A sample space is continuous if it contains an
interval or real numbers.

An event is a subset of the sample space of a
random experiment.
Probability
Whenever a sample space consists of n possible outcomes that are
equally likely, the probability of the outcome 1/n.

For a discrete sample space, the probability of an event E, denoted
by P(E), equals the sum of the probabilities of the outcome in E.

Some rules for probabilities:
For a given sample space containing n events E
1
, E
2
, E
3
, ........,E
n
1. All simple event probabilities must lie between 0 and 1:
0 <= P(E
i
) <= 1 for i=1,2,........,n
2. The sum of the probabilities of all the simple events within a
sample space must be equal to 1:



1 ) (
1
=

=
n
i
i
E P
Discrete Random Variable
A Random Variable (RV) is obtained by assigning a
numerical value to each outcome of a particular
experiment.

Probability Distribution: A table or formula that
specifies the probability of each possible value for
the Discrete Random Variable (DRV)

DRV: a RV that takes a whole number value only

Summary Continued
For Discrete RV we often have a mathematical formula which is
used to calculate probabilities,
i.e. P(x) = some formula
This formula is called the Probability Mass Function (PMF)
Given the PMF you can calculate the mean and variance by:




When the summation is over all possible values of x


2 2 2
) (
) (
o

=
=

x P x
x xP
Binomial Distribution General
Formula
This all leads to a very general rule for calculating binomial probabilities:

In General Binomial (n,p)
n = no. of trials
p = probability of a success
x = RV (no. of successes)




Where P(X=x) is read as the probability of seeing x
successes.

x n x
p p
x
n
x X P

|
|
.
|

\
|
= = ) 1 ( ) (
Binomial Distribution
If X is a binomial random variable with the
paramerters p and n then

) 1 (
) 1 (
2
p np
p np
np
=
=
=
o
o

Poisson Probability Distribution


Probability Distribution for Poisson
Where is the known mean:







x is the value of the RV with possible values 0,1,2,3,.
e = irrational constant (like t) with value 2.71828

The standard deviation , o, is given by the simple relationship;

o =

!
) (
x
e
x X P
x


= =

Continuous Probability Distributions


Continuous Probability Distributions
Experiments can lead to continuous responses i.e. values
that do not have to be whole numbers. For example:
height could be 1.54 meters etc.
In such cases the sample space is best viewed as a
histogram of responses.
The Shape of the histogram of such responses tells us what
continuous distribution is appropriate there are many.

Lifetime of Component
D
e
n
s
i t y
0.0 0.5 1.0 1.5 2.0 2.5
0
1
2
3
4
Waiting Time
D
e
n
s
i t y
0.0 0.2 0.4 0.6 0.8 1.0
0
. 0
0
. 2
0
. 4
0
. 6
0
. 8
1
. 0
1
. 2
Normal Distribution (AKA Gaussian)
The Histogram below is symmetric & 'bell shaped'
This is characteristic of the Normal Distribution
We can model the shape of such a distribution (i.e. the
histogram) by a Curve


Normal Distribution
The Curve may not fit the histogram 'perfectly' - but should be
very close
Normal Distribution - two parameters,

= mean, o = standard deviation,

The mathematical formula that gives a bell shaped symmetric
curve

f(x) = Height of curve at x =
2
2
2
) (
2
2
1
o

o
x
e
Normal Distribution
Why Not P(x) as before?
=> because response is continuous

What is the probability that a person sampled at random is 6
foot?

Equivalent question: what proportion of people are 6 foot?

=> really mean what proportion are
'around 6 foot' ( as good as the measurement device allows) -
so not really one value, but many values close together.

Example: What proportion of graduates earn 35,000?

Would we exclude 35,000.01 or 34,999.99?

Round to the nearest , 10, 100, 1000?

Continuous measure => more useful to get proportion
from 35,000 - 40,000
Some Mathematical Jargon:

The formula for the normal distribution is formally called
the normal probability density function (pdf)


The Shaded
portion of the
Histogram is the
Proportion of
interest
Can visualise this using the histogram of salaries.
Since the histogram of salaries is symmetric and bell shaped,
we model this in statistics with a Normal distribution curve.
Proportion = the
proportion of the
area of the curve
that is shaded
So proportions
= proportional area under the curve
= a probability of interest

Need;
To know , o
To be able to find area under curve
Area under a curve is found using integration in
mathematics.
In this case would need a technique called numerical
integration.
Total area under curve is 1.
However, the values we need are in Normal
Probability Tables.

The Tables are for a Normal Distribution with
= 0 and o = 1

this is called the Standard Normal
Can 'convert' a value from any normal to the standard
normal using standard scores (Z scores)
Value from any
Normal
Distribution
Standardiz
e

Corresponding
Value from Normal
= 0
o = 1
Z : score ed standardis
o

=
x
Standard Normal
Z scores are a unit-less quantity, measuring how far above/below
a certain score (x) is, in standard deviation units.

Example: A score of 35, from a normal distribution with
= 25 and o = 5.
Z = ( 35 25) / 5 => 10/5 = 2

So 35 is 2 standard deviation units above the mean

What about a score of 20 ?

Z = ( 20 - 25) / 5 => 5 / 5 = 1

So 20 is 1 standard unit below the mean
Z-Score Example
Positive Z score => score is above the mean
Negative Z score => score is below the mean


By subtracting and dividing by the o we convert any normal
to = 0, o=1, so only need one set of tables!


Z-Score Example

From looking at the histogram of peoples weekly receipts, a
supermarket knows that the amount people spend on shopping
per week is normally distributed with:

= 58
o = 15.

Example:
What is the probability that a customer sampled at random will
spend less than 83.50 ?

Z = ( x ) / o
= ( 83.50 - 58 ) / 15 => 1.7
Area from Z=1.7 to
the left can be read
in tables
From tables area less than
Z = 1.7 => 0.9554

So probability is 0.9554
Or 95.54%
What is the probability that a customer sampled at random will
spend more than 83.50 ?

Z = ( x ) / o
= ( 83.50 - 58 ) / 15 => 1.7
From tables area greater than
Z = 1.7 => 1- 0.9554 =
0.0446

So probability is 0.0446
Or 4.46%
Exercise
Find the proportion of people who spend more than
76.75
Find the proportion of people who spend less than
63.50

Note: The tables can also be used to find other areas
(less than a particular value, or the area between two
points)


Characteristics of Normal Distributions
Standard Deviation has particular relevance to
Normal distribution
Normal Distribution => Empirical Rule


68%
95%
99.7%
Between Z
(lower, upper)
%Area
-1,1 68 %
-2,2 95 %
-3,3 99.7 %
-, + 100%
The normal distribution is just one of the known
continuous probability distributions.
Each have their own probability density function,
giving different shaped curves.
In each case, we find probabilities by calculating
areas under these curves using integration.
However, the Normal is the most important as it
plays a major role in Sampling Theory.
Other important continuous probability
distributions include
Exponential distribution especially positively skewed
lifetime data.

Uniform distribution.

Weibull especially for time to event analysis.

Gamma distribution waiting times between Poisson
events in time etc.

Many others..

Summary Random Variables
There are two types discrete RVs and continuous
RVs
For both cases we can calculate a mean () and
standard deviation ()
can be interpreted as average value of the RV
can be interpreted as the standard deviation of the
RV

Summary Continued
For Discrete RV we often have a mathematical formula which is
used to calculate probabilities,
i.e. P(x) = some formula
This formula is called the Probability Mass Function (PMF)
Given the PMF you can calculate the mean and variance by:




When the summation is over all possible values of x


2 2 2
) (
) (
o

=
=

x P x
x xP
Summary Continued
For continuous RVs, we use a Probability Density Function (PDF)
to define a curve over the histogram of the values of the
random variables.
We integrate this PDF to find areas which are equal to
probabilities of interest.
Given the PDF you can calculate the mean and variance by:



Where f(x) is usual mathematical notation for the PDF



o - dx ) ( dx ) (
2 2
x f x x xf
} }


= =
Question Time

Next Week
Next week we will start with the practical part of
the course. We will move to Lab 1005 in Aungier
Street

You might also like