Stats For AI

Special Chapters on Artificial Intelligence
Lecture 1. Probability and Statistics
Cristian Gatu
Faculty of Computer Science
“Alexandru Ioan Cuza” University of Iaşi, Romania
MCO, MDS, MCL 2019–2020

Content
Random Variables and Probability Distributions

Definitions and axioms of probability
Random variables
Statistical measures
Location measures
Dispersion measures
◮ Experiment – any operation whose outcome is subject to chance. E.g. the toss of a coin or the roll of a die.
◮ Outcome Space – the set of all possible outcomes. E.g. for the coin S = {h, t} and for the die S = {1, 2, 3, 4, 5, 6}.
◮ Event – any subset of the outcome space. E.g. the event of obtaining heads is {h}. The event of obtaining an even number is {2, 4, 6}.
◮ Outcome – one of the things that can happen in an experiment.
Probability
◮ Suppose we throw a die.
– outcome space: S = {1, 2, 3, 4, 5, 6}.
– outcomes: A1 = [1], A2 = [2], . . . , A6 = [6].
– outcomes probability: P(A1 ), P(A2 ), . . . , P(A6 ).
◮ Suppose P(A1) = P(A2) = · · · = P(A6) = 1/6.
◮ The event of obtaining an even number: E = {2, 4, 6}.
◮ P(E ) = P(A2 ) + P(A4 ) + P(A6 ) = 3/6 = 1/2.
◮ If each outcome is equally likely, then
$$P(E) = \frac{\text{Number of outcomes in } E}{\text{Number of outcomes in } S} = \frac{|E|}{|S|}.$$
Probability
◮ The probability of an event A : 0 ≤ P(A) ≤ 1
◮ The complement of A: $\bar{A} = \{e \mid e \in S,\ e \notin A\}$.
◮ The probability of $\bar{A}$: $P(\bar{A}) = 1 - P(A)$.
Examples
Throw a die: S = {1, 2, 3, 4, 5, 6}.
– $E$ = {2, 4, 6}: the die shows an even number.
– $\bar{E}$ = {1, 3, 5}: the die shows an odd number.
$P(E)$ = 3/6 = 0.5 and $P(\bar{E})$ = 1 − 0.5 = 0.5.
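The counting rule P(E) = |E|/|S| and the complement rule can be checked with a short sketch. The helper name `probability` is an illustrative choice, and the code assumes equally likely outcomes, as in the die example.

```python
from fractions import Fraction

def probability(event, outcome_space):
    """P(E) = |E| / |S|; only valid when all outcomes are equally likely."""
    event = set(event) & set(outcome_space)
    return Fraction(len(event), len(outcome_space))

S = {1, 2, 3, 4, 5, 6}   # outcome space of the die
E = {2, 4, 6}            # event: even number
E_complement = S - E     # event: odd number

p_even = probability(E, S)
p_odd = probability(E_complement, S)
print(p_even, p_odd)  # 1/2 1/2
```

Exact fractions are used instead of floats so that the identity P(Ē) = 1 − P(E) holds without rounding error.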
Independent and Dependent events
Two events, A and B, are independent if and only if the occurrence of A provides no information about the occurrence of B and vice versa. Formally, P(A ∩ B) = P(A) P(B).
Examples
In a coin tossing experiment the fact that at the first toss we
get heads or tails does not provide any information about the
second toss.
Independent and Dependent events
The events A and B are dependent if knowledge of the occurrence of one provides information about the other.
Examples
Consider the following events in the toss of a die:
A = {Observe an odd number}.
B = {Observe an even number}.
The events A and B are dependent since the occurrence of one excludes the other: P(A ∩ B) = 0, whereas P(A) P(B) = 1/4.
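A minimal sketch of how the two examples can be tested numerically, using the standard product criterion P(A ∩ B) = P(A) P(B) for independence. The function names and the encoding of two coin tosses as pairs are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def prob(event, space):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & space), len(space))

def independent(a, b, space):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b, space) == prob(a, space) * prob(b, space)

# Die example: odd versus even numbers.
die = set(range(1, 7))
odd = {1, 3, 5}
even = {2, 4, 6}

# Coin example: outcomes of two tosses, e.g. ('h', 't').
two_tosses = set(product("ht", repeat=2))
first_heads = {o for o in two_tosses if o[0] == "h"}
second_heads = {o for o in two_tosses if o[1] == "h"}

die_independent = independent(odd, even, die)
coin_independent = independent(first_heads, second_heads, two_tosses)
print(die_independent, coin_independent)  # False True
```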
Bayes Theorem
◮ Conditional probability: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
◮ Bayes formula: $P(B \mid A) = \dfrac{P(A \mid B)\,P(B)}{P(A)}$
– P(B): prior probability
– P(B|A): posterior probability
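As a sanity check, the Bayes formula can be compared with the direct definition of conditional probability. The events below (A: even number, B: number greater than 3, on a fair die) are hypothetical and not taken from the lecture.

```python
from fractions import Fraction

S = set(range(1, 7))
A = {2, 4, 6}   # hypothetical event: even number
B = {4, 5, 6}   # hypothetical event: number greater than 3

def prob(event):
    """Probability under equally likely outcomes of the die."""
    return Fraction(len(event), len(S))

def conditional(x, given):
    """P(x | given) = P(x and given) / P(given)."""
    return prob(x & given) / prob(given)

p_a_given_b = conditional(A, B)
posterior_bayes = p_a_given_b * prob(B) / prob(A)   # Bayes formula
posterior_direct = conditional(B, A)                # definition
print(posterior_bayes, posterior_direct)  # 2/3 2/3
```

Here P(B) = 1/2 is the prior; observing A raises it to the posterior P(B|A) = 2/3.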
Discrete random variables
◮ A random variable is denoted by X and its values by x.
◮ Let X be a discrete variable with possible values x1, x2, . . . , xn, such that
P(X = xi) = pi where i = 1, 2, . . . , n.
◮ X is a discrete random variable if
$$\sum_{i=1}^{n} p_i = 1 \quad\text{and}\quad 0 \le p_i \le 1.$$
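The two conditions on the pi translate directly into a check. This is a small sketch; `is_valid_pmf` and both example distributions are hypothetical.

```python
from fractions import Fraction

def is_valid_pmf(probs):
    """True when every p_i lies in [0, 1] and the p_i sum to exactly 1."""
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

fair_die = [Fraction(1, 6)] * 6             # hypothetical: a fair die
incomplete = [Fraction(1, 2), Fraction(1, 3)]  # sums to 5/6 only

fair_ok = is_valid_pmf(fair_die)
incomplete_ok = is_valid_pmf(incomplete)
print(fair_ok, incomplete_ok)  # True False
```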
Continuous random variables
◮ Random variables that can take on any value in an interval are called continuous.
◮ Let X be a continuous variable such that
$$P(x_1 \le X < x_2) = p_1,\quad P(x_2 \le X < x_3) = p_2,\quad \ldots,\quad P(x_n \le X < x_{n+1}) = p_n.$$
◮ X is said to be a continuous random variable iff
$$\sum_{i=1}^{n} p_i = 1 \quad\text{and}\quad 0 \le p_i \le 1.$$
Probability density function
◮ The probability density function (pdf) of a continuous random variable X is a function that allocates probabilities to all of the ranges of values that the random variable can take.
◮ The pdf takes the form of a function of x, say f(x).
◮ Integrating f(x) over a range of values of x gives the probability that the random variable X lies in that particular range.
◮ X is a continuous random variable iff
$$\int_{\text{all } x} f(x)\,dx = 1.$$
Let X have a pdf f(x) valid over the range a to b only. If a ≤ x1 ≤ x2 ≤ b, then
$$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx.$$
[Figure: the curve y = f(x) plotted against x, with a, x1, x2 and b marked on the x-axis.]
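To illustrate how a probability is obtained by integrating a pdf over a range, here is a sketch using the midpoint rule. The density f(x) = 2x on [0, 1] is a hypothetical example chosen for its simple integral; it does not appear in the lecture.

```python
def integrate(f, lo, hi, steps=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * width) for k in range(steps)) * width

def pdf(x):
    """Hypothetical pdf: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

total = integrate(pdf, 0.0, 1.0)     # should be close to 1
p_range = integrate(pdf, 0.5, 1.0)   # P(0.5 <= X <= 1), exact value 0.75
print(round(total, 6), round(p_range, 6))
```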
Expectation
◮ Given a random variable X with pdf
– P(X = x) for X discrete
– f(x) for X continuous
the expectation of X is
$$E(X) = \begin{cases} \sum\limits_{\text{all } x} x\,P(X = x), & \text{discrete } X,\\[4pt] \int\limits_{\text{all } x} x\,f(x)\,dx, & \text{continuous } X. \end{cases}$$
◮ E(X) is just the arithmetic mean of a discrete probability distribution.
Example 1
x          0    1    2    3    4
P(X = x)   1/4  1/8  1/8  1/4  1/4

$$E(X) = \sum_{\text{all } x} x\,P(X = x) = 0 \times \tfrac14 + 1 \times \tfrac18 + 2 \times \tfrac18 + 3 \times \tfrac14 + 4 \times \tfrac14 = \tfrac{17}{8} = 2.125.$$
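The same computation can be reproduced exactly in a few lines. The dictionary layout (value mapped to probability) is an assumption made for this sketch.

```python
from fractions import Fraction as F

# Distribution from Example 1: value -> probability.
pmf = {0: F(1, 4), 1: F(1, 8), 2: F(1, 8), 3: F(1, 4), 4: F(1, 4)}

def expectation(pmf):
    """E(X) = sum over all x of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

e_x = expectation(pmf)
print(e_x, float(e_x))  # 17/8 2.125
```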
Example 2
$$f(x) = \tfrac34\,x(2 - x) \quad\text{for } 0 \le x \le 2.$$
$$E(X) = \int_0^2 x\,f(x)\,dx = \frac34 \int_0^2 (2x^2 - x^3)\,dx = \frac34\left[\frac23 x^3 - \frac14 x^4\right]_0^2 = \frac34\left(\frac{16}{3} - \frac{16}{4}\right) = 1.$$
[Figure: plot of f(x) = 3/4 x (2 − x) for 0 ≤ x ≤ 2; horizontal axis x, vertical axis f(x).]
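The result of Example 2 can be cross-checked numerically. The sketch below uses a simple midpoint rule as the integration method, which is a choice made here rather than part of the lecture.

```python
def integrate(f, lo, hi, steps=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * width) for k in range(steps)) * width

def f(x):
    """pdf from Example 2: f(x) = 3/4 * x * (2 - x) on [0, 2]."""
    return 0.75 * x * (2.0 - x)

area = integrate(f, 0.0, 2.0)                    # total probability
e_x = integrate(lambda x: x * f(x), 0.0, 2.0)    # E(X)
print(round(area, 6), round(e_x, 6))
```

Both values come out close to 1: the first confirms that f is a valid pdf, the second matches E(X) = 1. This is as expected, since f is symmetric about x = 1.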
Expectation. Properties
◮ Let g(X) be any function of a random variable X having pdf P(X = x) for discrete X and f(x) for continuous X.
◮ The expectation of g(X), written as E(g(X)), is defined as:
$$E(g(X)) = \begin{cases} \sum\limits_{\text{all } x} g(x)\,P(X = x), & \text{discrete } X,\\[4pt] \int\limits_{\text{all } x} g(x)\,f(x)\,dx, & \text{continuous } X. \end{cases}$$
Expectation. Properties
For a constant a and functions f1, f2 of X:
1. E(a) = a.
2. E(aX) = aE(X).
3. E(f1(X) + f2(X)) = E(f1(X)) + E(f2(X)).
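The three properties can be verified on a concrete distribution. This sketch reuses the distribution from Example 1; the constant a = 3 and the functions f1(x) = x² and f2(x) = 2x are arbitrary illustrative choices.

```python
from fractions import Fraction as F

# Distribution from Example 1: value -> probability.
pmf = {0: F(1, 4), 1: F(1, 8), 2: F(1, 8), 3: F(1, 4), 4: F(1, 4)}

def expect(g, pmf):
    """E(g(X)) = sum over all x of g(x) * P(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())

a = 3  # arbitrary constant chosen for illustration
mean = expect(lambda x: x, pmf)

const = expect(lambda x: a, pmf)         # property 1: equals a
scaled = expect(lambda x: a * x, pmf)    # property 2: equals a * E(X)
lhs = expect(lambda x: x**2 + 2 * x, pmf)                        # E(f1 + f2)
rhs = expect(lambda x: x**2, pmf) + expect(lambda x: 2 * x, pmf)  # E(f1) + E(f2)
print(const == a, scaled == a * mean, lhs == rhs)  # True True True
```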

Variance
◮ The variance of a probability distribution associated with a random variable X is written as Var(X) and is defined by:
$$\mathrm{Var}(X) = E\big((X - \mu)^2\big) \quad\text{where } \mu = E(X).$$
◮ The variance can be computed by:
$$\mathrm{Var}(X) = E(X^2) - E^2(X) \quad\text{where } E^2(X) = \big(E(X)\big)^2.$$
Example
Given the discrete distribution

x          0    1    2
P(X = x)   1/4  1/2  1/4

find the variance of X.
Solution
$$\mathrm{Var}(X) = E(X^2) - \big(E(X)\big)^2.$$
$$E(X) = \sum x\,P(X = x) = 0 \times \tfrac14 + 1 \times \tfrac12 + 2 \times \tfrac14 = 1.$$
$$E(X^2) = \sum x^2\,P(X = x) = 0^2 \times \tfrac14 + 1^2 \times \tfrac12 + 2^2 \times \tfrac14 = \tfrac32.$$
$$\mathrm{Var}(X) = \tfrac32 - 1^2 = \tfrac12.$$
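A short sketch confirming that the definition E((X − µ)²) and the computational formula E(X²) − E²(X) agree for this distribution. The helper `expect` is an illustrative name.

```python
from fractions import Fraction as F

# Distribution from the variance example: value -> probability.
pmf = {0: F(1, 4), 1: F(1, 2), 2: F(1, 4)}

def expect(g, pmf):
    """E(g(X)) = sum over all x of g(x) * P(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expect(lambda x: x, pmf)
var_definition = expect(lambda x: (x - mu) ** 2, pmf)   # E((X - mu)^2)
var_shortcut = expect(lambda x: x**2, pmf) - mu**2      # E(X^2) - E(X)^2
print(mu, var_definition, var_shortcut)  # 1 1/2 1/2
```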
Mean
◮ The arithmetic mean (or just mean) of a set of numbers {x1, x2, . . . , xn} is denoted by x̄ and is defined by:
$$\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) = \frac{1}{n}\sum_{i=1}^{n} x_i.$$
◮ Consider a discrete frequency distribution taking values {x1, x2, . . . , xn} with corresponding frequencies {f1, f2, . . . , fn}. The mean x̄ is given by:
$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}.$$
Examples
◮ Find the mean of the set {−3, −1, 0, 2, 3, 4}.
x̄ = (−3 − 1 + 0 + 2 + 3 + 4)/6 = 0.83.
◮ Find the mean of the following frequency distribution:
xi      -3   -2   -1    0    1    2    3
fi       6    5    4    3    2    1    1
fi xi  -18  -10   -4    0    2    2    3

$\sum f_i = 22$, $\sum f_i x_i = -25$, $\bar{x} = -25/22 = -1.14$.
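Both mean examples can be reproduced with exact arithmetic. The function names below are illustrative.

```python
from fractions import Fraction

def mean(values):
    """Arithmetic mean: (x1 + ... + xn) / n."""
    return Fraction(sum(values), len(values))

def frequency_mean(values, freqs):
    """Mean of a frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    return Fraction(sum(f * x for x, f in zip(values, freqs)), sum(freqs))

simple = mean([-3, -1, 0, 2, 3, 4])

xs = [-3, -2, -1, 0, 1, 2, 3]
fs = [6, 5, 4, 3, 2, 1, 1]
weighted = frequency_mean(xs, fs)

print(simple, round(float(simple), 2))      # 5/6 0.83
print(weighted, round(float(weighted), 2))  # -25/22 -1.14
```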
Median
◮ The median of a set of numbers {x1, x2, . . . , xn} is defined as the middle value of the set when arranged in size order.
◮ If the set has an even number of items, then the median is taken as the mean of the two middle values.
Remark
The mean has the disadvantage of being sensitive to extreme values, especially for a small set of numbers.
Examples
1. Some wages arranged in size order are:
{28, 29, 32, 35, 36, 38, 41, 103}.
The mean is x̄ = 342/8 = 42.75 and the median is x∗ = (35 + 36)/2 = 35.5.
2. Find the median of the set:
{65, 68, 68, 66, 64, 65, 65, 67}.
Arranging the set in order:
{64, 65, 65, 65, 66, 67, 68, 68}.
The median is given by: (65 + 66)/2 = 65.5.
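A quick way to compare mean and median on the two example sets is Python's standard `statistics` module. This sketch also shows how the single large wage of 103 pulls the mean above the median.

```python
import statistics

wages = [28, 29, 32, 35, 36, 38, 41, 103]
scores = [65, 68, 68, 66, 64, 65, 65, 67]

wage_mean = statistics.mean(wages)
wage_median = statistics.median(wages)      # mean of the two middle values
score_median = statistics.median(scores)    # sorts the data internally

# Dropping the extreme value changes the mean far more than the median.
trimmed = wages[:-1]
trimmed_mean = statistics.mean(trimmed)
trimmed_median = statistics.median(trimmed)

print(wage_mean, wage_median, score_median)
print(trimmed_mean, trimmed_median)
```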

Median
◮ Consider the discrete frequency distribution taking the values {x1, x2, . . . , xn} with corresponding frequencies {f1, f2, . . . , fn}.
◮ The median is given by the
$$\left(\frac{1 + \sum_{i=1}^{n} f_i}{2}\right)\text{th}$$
value when the values are ranked.
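The ranking rule can be sketched as follows. When the position (1 + Σfi)/2 is not a whole number, the code takes the mean of the two neighbouring ranked values, matching the rule for sets with an even number of items. The data are the frequency table from the mean examples; the function name is illustrative.

```python
from fractions import Fraction

def frequency_median(values, freqs):
    """Median of a frequency distribution via the (1 + N)/2-th ranked value."""
    ranked = []
    for x, f in sorted(zip(values, freqs)):
        ranked.extend([x] * f)          # expand each value f times, in order
    n = len(ranked)
    if n % 2 == 1:
        return Fraction(ranked[(n + 1) // 2 - 1])   # exact middle value
    # Even count: position falls between ranks n/2 and n/2 + 1.
    return Fraction(ranked[n // 2 - 1] + ranked[n // 2], 2)

xs = [-3, -2, -1, 0, 1, 2, 3]
fs = [6, 5, 4, 3, 2, 1, 1]
median = frequency_median(xs, fs)
print(median)  # -3/2
```

With 22 observations the position is 11.5, so the median is the mean of the 11th value (−2) and the 12th value (−1).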

Mode
◮ The mode of a set of values is defined as the one which occurs with the greatest frequency.
Examples
1. The mode of the set {2, 3, 3, 1, 3, 2, 4, 5, 8, 3, 2, 4, 4, 3}
is 3.
2. The set {8, 6, 8, 5, 5, 7, 6, 8, 6, 9} has the two modes 6
and 8.
Remark
Note that for a set that has no repeated values the mode does
not exist.
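A small sketch of the mode that follows the convention in the remark: it returns every value tied for the greatest frequency, and an empty list when no value repeats. The function name `modes` is an illustrative choice. Note that the standard library's `statistics.multimode` behaves differently in the last case, returning all values instead.

```python
from collections import Counter

def modes(values):
    """Return all values with the greatest frequency, or [] if none repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []   # no repeated values: the mode does not exist
    return sorted(v for v, c in counts.items() if c == top)

single = modes([2, 3, 3, 1, 3, 2, 4, 5, 8, 3, 2, 4, 4, 3])
double = modes([8, 6, 8, 5, 5, 7, 6, 8, 6, 9])
none = modes([1, 2, 3, 4])   # hypothetical set without repeats
print(single, double, none)  # [3] [6, 8] []
```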
Range
◮ The Range of a set of numbers S = {x1, x2, . . . , xn} is given by:
Range = max(S) − min(S).
Remark 1
The range uses only the extreme values!
Remark 2
The Range is the simplest of all measures of dispersion and
can be calculated very quickly and easily.
Range
Examples
1. The set {6, 5, 7, 10, 8, 9} has Range = 10 − 5 = 5.
2. The set {600, 610, 620, 600, 610, 650, 640, 650, 650} has Range = 650 − 600 = 50.
3. The set {600, 610, 620, 200, 610, 1000, 640, 650, 650} has Range = 1000 − 200 = 800.
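The three examples in code form, as a minimal sketch. Sets 2 and 3 differ in only two values, yet the range jumps from 50 to 800, which illustrates its dependence on the extremes.

```python
def value_range(values):
    """Range = max(S) - min(S)."""
    return max(values) - min(values)

r1 = value_range([6, 5, 7, 10, 8, 9])
r2 = value_range([600, 610, 620, 600, 610, 650, 640, 650, 650])
r3 = value_range([600, 610, 620, 200, 610, 1000, 640, 650, 650])
print(r1, r2, r3)  # 5 50 800
```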
Standard Deviation
◮ The standard deviation is the measure of dispersion used most widely in statistics. It is based on the arithmetic mean.
◮ The standard deviation of a set of numbers {x1, x2, . . . , xn} with mean x̄ is denoted by S and defined as
$$S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2}.$$
Standard Deviation
Examples
The set {3, 4, 6, 2} has $\bar{x} = 15/4 = 3.75$, $\bar{x}^2 = 14.063$ and $\sum_{i=1}^{n} x_i^2 = 65$. Thus,
$$S = \left(\frac{65}{4} - (3.75)^2\right)^{1/2} = (16.25 - 14.063)^{1/2} = 1.48.$$
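The two forms of the formula can be compared against the standard library. Since S divides by n, the matching library function is `statistics.pstdev` (population standard deviation), not `statistics.stdev`, which divides by n − 1.

```python
import math
import statistics

data = [3, 4, 6, 2]
n = len(data)
mean = sum(data) / n

# Definition: root of the mean squared deviation from the mean.
s_definition = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
# Computational form: root of (mean of squares minus squared mean).
s_shortcut = math.sqrt(sum(x * x for x in data) / n - mean**2)
# Library check: population standard deviation divides by n as well.
s_library = statistics.pstdev(data)

print(round(s_definition, 2), round(s_shortcut, 2), round(s_library, 2))
```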
Variance
◮ The Variance of a set (or distribution) of numbers is defined as the square of the standard deviation and is denoted by S².
◮ For a set of numbers:
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2.$$
◮ For a discrete frequency distribution:
$$S^2 = \frac{\sum_i f_i (x_i - \bar{x})^2}{\sum_i f_i} = \frac{\sum_i f_i x_i^2}{\sum_i f_i} - \bar{x}^2.$$
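As a sketch, the two expressions for the variance of a discrete frequency distribution can be checked against each other with exact fractions. The data reuse the frequency table from the mean examples; the function names are illustrative.

```python
from fractions import Fraction

def frequency_mean(values, freqs):
    """x-bar = sum(f_i * x_i) / sum(f_i)."""
    return Fraction(sum(f * x for x, f in zip(values, freqs)), sum(freqs))

def frequency_variance(values, freqs):
    """S^2 = sum(f_i * (x_i - x-bar)^2) / sum(f_i)."""
    m = frequency_mean(values, freqs)
    return sum(f * (x - m) ** 2 for x, f in zip(values, freqs)) / sum(freqs)

def frequency_variance_shortcut(values, freqs):
    """S^2 = sum(f_i * x_i^2) / sum(f_i) - x-bar^2."""
    m = frequency_mean(values, freqs)
    return Fraction(sum(f * x * x for x, f in zip(values, freqs)), sum(freqs)) - m**2

xs = [-3, -2, -1, 0, 1, 2, 3]
fs = [6, 5, 4, 3, 2, 1, 1]
var_a = frequency_variance(xs, fs)
var_b = frequency_variance_shortcut(xs, fs)
print(var_a, var_b, round(float(var_a), 3))  # 1421/484 1421/484 2.936
```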
