
Statistics and Probability

Salvius Paulus Lengkong, S.Pd., M.Eng


AI – CS364
Uncertainty Management

Basic Probability Theory

• The concept of probability has a long history that goes back thousands
of years when words like “probably”, “likely”, “maybe”, “perhaps”
and “possibly” were introduced into spoken languages.
• However, the mathematical theory of probability was formulated only
in the 17th century.

• The probability of an event is the proportion of cases in which the event occurs.
• Probability can also be defined as a scientific measure of chance.





• Probability can be expressed mathematically as a numerical index ranging from zero (an absolute impossibility) to unity (an absolute certainty).

• Most events have a probability index strictly between 0 and 1, which means that each event has at least two possible outcomes: a favourable outcome (success) and an unfavourable outcome (failure).

P(\text{success}) = \frac{\text{the number of successes}}{\text{the number of possible outcomes}}

P(\text{failure}) = \frac{\text{the number of failures}}{\text{the number of possible outcomes}}





• If s is the number of times success can occur, and f is the number of times failure can occur,
then
P(\text{success}) = p = \frac{s}{s + f}

P(\text{failure}) = q = \frac{f}{s + f}

• and p + q = 1

• If we throw a coin, the probability of getting a head will be equal to the probability of getting a
tail. In a single throw, s = f = 1, s + f = 2, and therefore the probability of getting a head (or a
tail) is 0.5.
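
To make the arithmetic concrete, here is a minimal Python sketch of the coin example above (the variable names s, f, p and q follow the slide's notation):

```python
s = 1               # number of ways a head (success) can occur in a single throw
f = 1               # number of ways a tail (failure) can occur in a single throw

p = s / (s + f)     # probability of success
q = f / (s + f)     # probability of failure
print(p, q, p + q)  # 0.5 0.5 1.0
```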




Conditional Probability
• Let A be an event in the world and B be another event. Suppose that events A and B are not mutually exclusive, but occur conditionally on the occurrence of each other.
• The probability that event A will occur if event B occurs is called the conditional probability.
• Conditional probability is denoted mathematically as p(A|B) in which the vertical bar represents
"given" and the complete probability expression is interpreted as
– “Conditional probability of event A occurring given that event B has occurred”.

p(A \mid B) = \frac{\text{the number of times } A \text{ and } B \text{ can occur}}{\text{the number of times } B \text{ can occur}}

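
As an illustration of this counting definition, here is a small Python sketch; the counts are hypothetical, not from the slides:

```python
# Hypothetical counts over the same set of observed cases (not from the slides).
n_A_and_B = 12   # cases in which both A and B occurred
n_B = 30         # cases in which B occurred

p_A_given_B = n_A_and_B / n_B   # p(A|B) = (times A and B occur) / (times B occurs)
print(p_A_given_B)              # 0.4
```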


• The number of times A and B can occur, or the probability that both A and B will occur, is called
the joint probability of A and B. It is represented mathematically as p(A ∩ B). The number of
ways B can occur corresponds to the probability of B, p(B), and thus

p A  B 
p A B  
p B 
• Similarly, the conditional probability of event B occurring given that event A has occurred
equals

p(B \mid A) = \frac{p(B \cap A)}{p(A)}



Hence
p(B \cap A) = p(B \mid A) \times p(A)

and

p(A \cap B) = p(B \mid A) \times p(A)

Substituting the last equation into the equation

p(A \mid B) = \frac{p(A \cap B)}{p(B)}
yields the Bayesian rule:




Bayesian Rule

p(A \mid B) = \frac{p(B \mid A) \times p(A)}{p(B)}

where:
p(A|B) is the conditional probability that event A occurs given that event B has occurred;
p(B|A) is the conditional probability of event B occurring given that event A has occurred;
p(A) is the probability of event A occurring;
p(B) is the probability of event B occurring.
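
A minimal Python sketch of the rule above; the function name and the example numbers are my own illustration, not from the slides:

```python
def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Bayesian rule: p(A|B) = p(B|A) * p(A) / p(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical values: p(B|A) = 0.9, p(A) = 0.1, p(B) = 0.3  ->  p(A|B) = 0.3
print(bayes(0.9, 0.1, 0.3))  # 0.3
```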




The Joint Probability


p(A) = \sum_{i=1}^{n} p(A \cap B_i) = \sum_{i=1}^{n} p(A \mid B_i) \times p(B_i)

[Figure: Venn diagram of event A overlapping the mutually exclusive events B1, B2, B3 and B4]





• If the occurrence of event A depends on only two mutually exclusive events, B and NOT B, we
obtain:

p(A) = p(A \mid B) \times p(B) + p(A \mid \neg B) \times p(\neg B)

where \neg is the logical function NOT.

• Similarly,

p(B) = p(B \mid A) \times p(A) + p(B \mid \neg A) \times p(\neg A)

• Substituting this equation into the Bayesian rule yields:

p(A \mid B) = \frac{p(B \mid A) \times p(A)}{p(B \mid A) \times p(A) + p(B \mid \neg A) \times p(\neg A)}

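
The same rule, with the denominator expanded by the total probability of B, can be sketched in Python as follows (the function name and numbers are illustrative only):

```python
def bayes_with_complement(p_b_given_a, p_a, p_b_given_not_a):
    """p(A|B) when p(B) is expanded as p(B|A)p(A) + p(B|not A)p(not A)."""
    p_not_a = 1.0 - p_a
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a  # total probability of B
    return p_b_given_a * p_a / p_b

# Hypothetical values: p(B|A) = 0.9, p(A) = 0.1, p(B|not A) = 0.2
print(round(bayes_with_complement(0.9, 0.1, 0.2), 3))  # 0.333
```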



Bayesian Reasoning
• Suppose all rules in the knowledge base are represented in the following form:

IF E is true
THEN H is true {with probability p}

• This rule implies that if event E occurs, then the probability that event H will occur is p.

• In expert systems, H usually represents a hypothesis and E denotes evidence to support this
hypothesis.



The Bayesian rule expressed in terms of hypotheses and evidence looks like this:

p E H   p H 
p H E  
p  E H   p  H   p  E H   p  H 
where:
p(H) is the prior probability of hypothesis H being true;
p(E|H) is the probability that hypothesis H being true will result in evidence E;
p(¬H) is the prior probability of hypothesis H being false;
p(E|¬H) is the probability of finding evidence E even when hypothesis H is false.



• In expert systems, the probabilities required to solve a problem are provided by experts.

• An expert determines the prior probabilities (the probabilities before any new evidence is observed) for the possible
hypotheses, p(H) and p(¬H), and also the conditional probabilities for observing evidence E if
hypothesis H is true, p(E|H), and if hypothesis H is false, p(E|¬H).

• Users provide information about the evidence observed and the expert system computes p(H|E)
for hypothesis H in light of the user-supplied evidence E. Probability p(H|E) is called the
posterior probability (the probability given the observed data) of hypothesis H upon observing evidence E.



• We can take into account both multiple hypotheses H1, H2, ..., Hm and multiple evidences E1,
E2, ..., En. The hypotheses as well as the evidences must be mutually exclusive and exhaustive.
• For a single evidence E and multiple hypotheses, the rule becomes:

p(H_i \mid E) = \frac{p(E \mid H_i) \times p(H_i)}{\sum_{k=1}^{m} p(E \mid H_k) \times p(H_k)}

• For multiple evidences and multiple hypotheses:

p(H_i \mid E_1 E_2 \ldots E_n) = \frac{p(E_1 E_2 \ldots E_n \mid H_i) \times p(H_i)}{\sum_{k=1}^{m} p(E_1 E_2 \ldots E_n \mid H_k) \times p(H_k)}



• This requires obtaining the conditional probabilities of all possible combinations of evidences
for all hypotheses, and thus places an enormous burden on the expert.

• Therefore, in expert systems, conditional independence among different evidences is assumed.

Thus, instead of the unworkable equation, we obtain:

p E1 H i   p E2 H i   . . .  p En H i   p H i 
p H i E1 E2 . . . En  
m
 p E1 H k   p E2 H k   . . .  p En H k   p H k 
k 1

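
A possible Python sketch of this computation, assuming the evidences are conditionally independent; the function name and data layout are my own choices, not from the slides:

```python
def naive_bayes_posterior(priors, likelihood_rows):
    """Posterior p(Hi | E1 E2 ... En) assuming conditionally independent evidences.

    priors          -- [p(H1), ..., p(Hm)]
    likelihood_rows -- one row per observed evidence Ej, where row[k] = p(Ej | Hk)
    """
    numerators = []
    for k, prior in enumerate(priors):
        product = prior
        for row in likelihood_rows:   # multiply p(Ej | Hk) over all observed evidences
            product *= row[k]
        numerators.append(product)
    total = sum(numerators)           # denominator: sum of the numerators over all Hk
    return [num / total for num in numerators]
```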



Ranking Potentially True Hypotheses


• Let us consider a simple example:

– Suppose an expert, given three conditionally independent evidences E1, E2 and E3, creates
three mutually exclusive and exhaustive hypotheses H1, H2 and H3, and
provides prior probabilities for these hypotheses – p(H1), p(H2) and p(H3), respectively. The
expert also determines the conditional probabilities of observing each evidence for all
possible hypotheses.




The Prior and Conditional Probabilities


Hypotheses: H1 = dengue, H2 = flu, H3 = malaria
Evidences:  E1 = headache, E2 = cough, E3 = high temperature

Probability      i = 1    i = 2    i = 3
p(Hi)            0.40     0.35     0.25
p(E1 | Hi)       0.3      0.8      0.5
p(E2 | Hi)       0.9      0.0      0.7
p(E3 | Hi)       0.6      0.7      0.9

Assume that we first observe evidence E3 (high temperature). The expert system computes the posterior
probabilities of each hypothesis as:


p(H_i \mid E_3) = \frac{p(E_3 \mid H_i) \times p(H_i)}{\sum_{k=1}^{3} p(E_3 \mid H_k) \times p(H_k)}, \quad i = 1, 2, 3

thus

p(H_1 \mid E_3) = \frac{0.6 \times 0.40}{0.6 \times 0.40 + 0.7 \times 0.35 + 0.9 \times 0.25} = 0.34

p(H_2 \mid E_3) = \frac{0.7 \times 0.35}{0.6 \times 0.40 + 0.7 \times 0.35 + 0.9 \times 0.25} = 0.34

p(H_3 \mid E_3) = \frac{0.9 \times 0.25}{0.6 \times 0.40 + 0.7 \times 0.35 + 0.9 \times 0.25} = 0.32

After evidence E3 is observed, belief in hypothesis H2 decreases and becomes equal to belief in
hypothesis H1. Belief in hypothesis H3 increases and even nearly reaches beliefs in hypotheses H1
and H2.





Facts:
(pH1 0.4)
(pH2 0.35)
(pH3 0.25)

(pE3H1 0.6)
(pE3H2 0.7)
(pE3H3 0.9)

User interface:
Print all facts and solve all the following queries:
? pH1E3
? pH2E3
? pH3E3


Suppose now that we observe evidence E1. The posterior probabilities are calculated as

p(H_i \mid E_1 E_3) = \frac{p(E_1 \mid H_i) \times p(E_3 \mid H_i) \times p(H_i)}{\sum_{k=1}^{3} p(E_1 \mid H_k) \times p(E_3 \mid H_k) \times p(H_k)}, \quad i = 1, 2, 3

hence

p(H_1 \mid E_1 E_3) = \frac{0.3 \times 0.6 \times 0.40}{0.3 \times 0.6 \times 0.40 + 0.8 \times 0.7 \times 0.35 + 0.5 \times 0.9 \times 0.25} = 0.19

p(H_2 \mid E_1 E_3) = \frac{0.8 \times 0.7 \times 0.35}{0.3 \times 0.6 \times 0.40 + 0.8 \times 0.7 \times 0.35 + 0.5 \times 0.9 \times 0.25} = 0.52

p(H_3 \mid E_1 E_3) = \frac{0.5 \times 0.9 \times 0.25}{0.3 \times 0.6 \times 0.40 + 0.8 \times 0.7 \times 0.35 + 0.5 \times 0.9 \times 0.25} = 0.29
Hypothesis H2 has now become the most likely one.



After observing evidence E2, the final posterior probabilities for all hypotheses are calculated:

p E1 H i   p E2 H i   p  E3 H i   p H i 
p  H i E1E2 E3   , i = 1, 2, 3
3
hence  p E1 H k   p E2 H k   p E3 H k   p H k 
k 1

0.3  0.9  0.6  0.40


p  H1 E1E2 E3    0.45
0.3  0.9  0.6  0.40 + 0.8  0.0  0.7  0.35 + 0.5  0.7  0.9  0.25
0.8  0.0  0.7  0.35
p H 2 E1E2 E3   0
0.3  0.9  0.6  0.40 + 0.8  0.0  0.7  0.35 + 0.5  0.7  0.9  0.25

0.5  0.7  0.9  0.25


p H 3 E1E2 E3    0.55
0.3  0.9  0.6  0.40 + 0.8  0.0  0.7  0.35 + 0.5  0.7  0.9  0.25
Although the initial ranking was H1, H2 and H3, only hypotheses H1 and H3 remain under consideration after all
evidences (E1, E2 and E3) were observed.
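
As a numerical check, the following self-contained Python sketch recomputes all three stages from the table values (shown to three decimals; the slides round to two):

```python
from math import prod

priors = [0.40, 0.35, 0.25]   # p(H1) dengue, p(H2) flu, p(H3) malaria
e1 = [0.3, 0.8, 0.5]          # p(E1|Hi): headache
e2 = [0.9, 0.0, 0.7]          # p(E2|Hi): cough
e3 = [0.6, 0.7, 0.9]          # p(E3|Hi): high temperature

for observed in ([e3], [e1, e3], [e1, e2, e3]):
    # numerator for each hypothesis: product of the observed likelihoods times the prior
    nums = [prod(row[k] for row in observed) * priors[k] for k in range(3)]
    total = sum(nums)
    print([round(n / total, 3) for n in nums])

# Output:
# [0.338, 0.345, 0.317]   after E3
# [0.189, 0.515, 0.296]   after E1 and E3
# [0.451, 0.0, 0.549]     after E1, E2 and E3
```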


Exercise
• From which bowl is the cookie?
To illustrate, suppose there are two bowls full of cookies. Bowl #1 has 10 chocolate chip and 30
plain cookies, while bowl #2 has 20 of each. Our friend Fred picks a bowl at random, and then
picks a cookie at random. We may assume there is no reason to believe Fred treats one bowl
differently from another, likewise for the cookies. The cookie turns out to be a plain one. How
probable is it that Fred picked it out of bowl #1?

(taken from wikipedia.org)




Solution
• Let H1 correspond to bowl #1, and H2 to bowl #2. It is given that the bowls are identical from Fred's point
of view, thus P(H1) = P(H2), and the two must add up to 1, so both are equal to 0.5. The datum D is the
observation of a plain cookie. From the contents of the bowls, we know that P(D | H1) = 30/40 = 0.75 and
P(D | H2) = 20/40 = 0.5. Bayes' formula then yields

P(H_1 \mid D) = \frac{P(D \mid H_1) \times P(H_1)}{P(D \mid H_1) \times P(H_1) + P(D \mid H_2) \times P(H_2)} = \frac{0.75 \times 0.5}{0.75 \times 0.5 + 0.5 \times 0.5} = 0.6

• Before observing the cookie, the probability that Fred chose bowl #1 is the prior probability, P(H1), which
is 0.5. After observing the cookie, we revise the probability to P(H1|D), which is 0.6.
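
A quick numerical check of this result in Python:

```python
p_h1 = p_h2 = 0.5        # prior: Fred picks either bowl with equal probability
p_d_given_h1 = 30 / 40   # p(plain cookie | bowl #1)
p_d_given_h2 = 20 / 40   # p(plain cookie | bowl #2)

p_h1_given_d = (p_d_given_h1 * p_h1) / (p_d_given_h1 * p_h1 + p_d_given_h2 * p_h2)
print(p_h1_given_d)      # 0.6
```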



Combinatorial Analysis
Permutations and Combinations
Factorial Notation

• Let n be a positive integer. Then factorial n, denoted n!, is defined as:

• n! = n(n - 1)(n - 2) ... 3 x 2 x 1.

• We define 0! = 1.
Examples:
• 4! = (4 x 3 x 2 x 1) = 24.
• 5! = (5 x 4 x 3 x 2 x 1) = 120.
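
These values can be checked directly in Python with math.factorial:

```python
from math import factorial

print(factorial(0))  # 1
print(factorial(4))  # 24
print(factorial(5))  # 120
```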
Permutations

• The different arrangements of a given number of things by taking some or all at a time are
called permutations.
Examples:
• All permutations (or arrangements) made with the letters a, b, c by taking two at a time are
(ab, ba, ac, ca, bc, cb).
• All permutations made with the letters a, b, c taking all at a time are:
( abc, acb, bac, bca, cab, cba)
Number of Permutations

• Number of all permutations of n things, taken r at a time, is given by:

nPr = n(n - 1)(n - 2) ... (n - r + 1) = n! / (n - r)!

• Examples:
• 6P2 = (6 x 5) = 30.
• 7P3 = (7 x 6 x 5) = 210.
• Corollary: the number of all permutations of n things, taken all at a time, is n!.
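
These counts can be verified in Python; math.perm requires Python 3.8 or later:

```python
from math import factorial, perm

print(perm(6, 2))                        # 30
print(perm(7, 3))                        # 210
print(factorial(7) // factorial(7 - 3))  # 210, i.e. 7P3 = n! / (n - r)!
```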
An Important Result

• If there are n objects of which p1 are alike of one kind, p2 are alike of another kind, p3 are
alike of a third kind, and so on, and pr are alike of the rth kind, such that (p1 + p2 + ... + pr) = n.

• Then, the number of permutations of these n objects is n! / (p1! x p2! x ... x pr!). A sketch of this count follows below.
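
A small Python sketch of this count; the word "BANANA" used below is my own illustration, not from the original slides:

```python
from math import factorial, prod

def multiset_permutations(counts):
    """Permutations of n objects where counts[i] of them are alike of kind i."""
    n = sum(counts)
    return factorial(n) // prod(factorial(c) for c in counts)

# "BANANA": 6 letters with 3 A's, 1 B and 2 N's  ->  6! / (3! x 1! x 2!) = 60
print(multiset_permutations([3, 1, 2]))  # 60
```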


Combinations

• Each of the different groups or selections which can be formed by taking some
or all of a number of objects is called a combination.
Examples:
1. Suppose we want to select two out of three boys A, B, C. Then, possible
selections are AB, BC and CA.
• Note: AB and BA represent the same selection.
2. All the combinations formed by a, b, c, taking two at a time, are ab, bc, ca.
3. The only combination that can be formed of three letters a, b, c taken all at a
time is abc.
4. Various groups of 2 out of four persons A, B, C, D are:
AB, AC, AD, BC, BD, CD.
5. Note that ab and ba are two different permutations, but they represent the
same combination.
Number of Combinations

• The number of all combinations of n things, taken r at a time, is:

nCr = n! / (r! x (n - r)!)

Note:
• nCn = 1 and nC0 = 1.
• nCr = nC(n - r)
Example:
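The worked example that originally followed here is not preserved in this copy; as an illustrative substitute (with my own numbers), the properties above can be checked in Python with math.comb (Python 3.8 or later):

```python
from math import comb

print(comb(4, 2))                # 6   -- the six pairs AB, AC, AD, BC, BD, CD listed above
print(comb(5, 5), comb(5, 0))    # 1 1 -- nCn = 1 and nC0 = 1
print(comb(10, 3), comb(10, 7))  # 120 120 -- nCr = nC(n - r)
```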
