
Ministry of Education Brand Course Taught in English for International Students in China

Probability and Statistics

Prof. Zheng Zheng

z. zheng 1
Probability of Bernoulli Trial
• p = P(S) on a single trial
• q=1–p
• n = number of trials
• k = number of successes

Flip Coin Experiments

– n identical trials (example: flip a coin 3 times)
– Two outcomes:
• Success or Failure (example: the outcomes are Heads or Tails)
– P(S) = p; P(F) = q = 1 – p (example: P(H) = 0.5; P(T) = 1 – 0.5 = 0.5)
– Trials are independent (example: a head on flip i doesn't change P(H) of flip i + 1)
– x is the number of S's in n trials

Flip Coin Experiments
Results of 3 flips | Probability | Combined | Summary
HHH | (p)(p)(p) | p^3 | (1) p^3 q^0
HHT | (p)(p)(q) | p^2 q |
HTH | (p)(q)(p) | p^2 q | (3) p^2 q^1
THH | (q)(p)(p) | p^2 q |
HTT | (p)(q)(q) | p q^2 |
THT | (q)(p)(q) | p q^2 | (3) p^1 q^2
TTH | (q)(q)(p) | p q^2 |
TTT | (q)(q)(q) | q^3 | (1) p^0 q^3
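
The multipliers (1), (3), (3), (1) in the summary column are the numbers of flip sequences with 3, 2, 1 and 0 heads. The following is a minimal sketch, not part of the original slides, that enumerates all sequences and compares the counts with the binomial coefficients; the variable names are illustrative.

```python
from collections import Counter
from itertools import product
from math import comb

n = 3  # number of flips, as in the table above

# Enumerate every H/T sequence of length n.
sequences = ["".join(s) for s in product("HT", repeat=n)]

# Count how many sequences contain exactly k heads.
by_heads = Counter(seq.count("H") for seq in sequences)

for k in range(n, -1, -1):
    # Every sequence with k heads has probability p^k q^(n-k);
    # the multiplier is the number of such sequences.
    print(f"k={k}: {by_heads[k]} sequences, C({n},{k}) = {comb(n, k)}")
```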

Example
• Consider rolling a fair die eight times. Find the probability that either 3 or 4 shows up five times.
• Answer: In this case we can identify
"success" $= A = \{\text{either 3 or 4}\} = \{f_3\} \cup \{f_4\}$.
• Thus $P(A) = P(f_3) + P(f_4) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$.
• Use $n = 8$, $k = 5$, $p = 1/3$.
• The desired probability is given by the binomial formula $P_n(k) = \binom{n}{k} p^k q^{n-k}$ with these values.
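
As a rough numerical check of this example, the sketch below evaluates the binomial formula for n = 8, k = 5, p = 1/3. The function name `binomial_pmf` is an illustrative choice, not something defined in the slides.

```python
from math import comb

def binomial_pmf(n: int, k: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_success = 2 / 6  # a fair die shows 3 or 4 with probability 1/3
prob = binomial_pmf(8, 5, p_success)
print(f"P(exactly five 3s or 4s in eight rolls) = {prob:.4f}")
```

The exact value is 56 × (1/3)^5 × (2/3)^3 = 448/6561.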

Example

Compute?

Properties of Bernoulli Trials
Let $X_k$ = "exactly k occurrences in n trials". Then
$$P(X_0 \cup X_1 \cup \cdots \cup X_n) = 1.$$
• Since $X_i$, $X_j$ are mutually exclusive,
$$P(X_0 \cup X_1 \cup \cdots \cup X_n) = \sum_{k=0}^{n} P(X_k) = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k}.$$
• Since $(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$, this sum equals
$$(p+q)^n = 1.$$

• Q: For a given n and p, what is the most likely value of k? That is, what value of k maximizes $P_n(k)$?
• Proof: To obtain this value, consider the ratio
$$\frac{P_n(k-1)}{P_n(k)} = \frac{n!\, p^{k-1} q^{n-k+1}}{(n-k+1)!\,(k-1)!} \cdot \frac{(n-k)!\,k!}{n!\, p^k q^{n-k}} = \frac{k}{n-k+1} \cdot \frac{q}{p}.$$
• If $k(1-p) \le (n-k+1)p$, or equivalently $k \le (n+1)p$, then $P_n(k) \ge P_n(k-1)$.
• Thus $P_n(k)$ increases up to $k = (n+1)p$ and peaks at $k_{max} = \lfloor (n+1)p \rfloor$, the integer part of $(n+1)p$.
[Figure: $P_n(k)$ versus k for n = 12, p = 1/2.]
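
The peak location can be checked numerically for the plotted case n = 12, p = 1/2. This is a minimal sketch under the assumption that the most likely k is the integer part of (n + 1)p; the names `k_mode` and `k_rule` are illustrative.

```python
from math import comb, floor

def binomial_pmf(n: int, k: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 12, 0.5
pmf = [binomial_pmf(n, k, p) for k in range(n + 1)]

# Location of the largest probability, found by direct search.
k_mode = max(range(n + 1), key=pmf.__getitem__)

# Location predicted by the ratio argument.
k_rule = floor((n + 1) * p)

print(k_mode, k_rule)
```

When (n + 1)p happens to be an integer, the ratio equals 1 there, so k = (n + 1)p and k = (n + 1)p − 1 are equally likely and a direct search reports only one of them.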
• $k_{max}$, the most likely number of successes in n trials, satisfies
$$(n+1)p - 1 \le k_{max} \le (n+1)p,$$
$$p - \frac{q}{n} \le \frac{k_{max}}{n} \le p + \frac{p}{n}.$$
• Thus, $\lim_{n \to \infty} \frac{k_{max}}{n} = p.$

• The ratio of the most probable number of successes to the total number of trials in a Bernoulli experiment tends to p, the probability of occurrence of A in a single trial.
– the classical definition of probability

Bernoulli’s theorem (Law of Large Numbers)

Let $X_i$ be independent, identically distributed Bernoulli random variables such that
$$P(X_i = 1) = p, \quad P(X_i = 0) = 1 - p = q,$$
and let $k = X_1 + X_2 + \cdots + X_n$ represent the number of "successes" in n trials. Then
$$P\left\{ \left| \frac{k}{n} - p \right| > \epsilon \right\} < \frac{pq}{n\epsilon^2},$$
i.e., the ratio "total number of successes to the total number of trials" tends to p in probability as n increases.
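
To see how loose or tight the bound is, the sketch below compares the exact binomial tail probability with pq/(nε²) for a few values of n. The choices p = 0.5, ε = 0.05 and the function name `tail_probability` are assumptions made only for illustration.

```python
from math import comb

def tail_probability(n: int, p: float, eps: float) -> float:
    """Exact P{|k/n - p| > eps} when k is Binomial(n, p)."""
    q = 1 - p
    return sum(
        comb(n, k) * p**k * q ** (n - k)
        for k in range(n + 1)
        if abs(k - n * p) > n * eps  # same event as |k/n - p| > eps
    )

p, eps = 0.5, 0.05
results = []
for n in (100, 400, 1000):
    exact = tail_probability(n, p, eps)
    bound = p * (1 - p) / (n * eps**2)
    results.append((n, exact, bound))
    print(f"n={n}: exact tail = {exact:.4f}, bound pq/(n eps^2) = {bound:.4f}")
```

Both columns shrink as n grows, which is the content of the theorem; the bound itself is usually far from tight.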

Proof:
$\left| \frac{k}{n} - p \right| > \epsilon$ is equivalent to $(k - np)^2 > n^2 \epsilon^2$.

$$\sum_{k=0}^{n} (k - np)^2 P_n(k) = \sum_{k=0}^{n} k^2 P_n(k) - 2np \sum_{k=0}^{n} k P_n(k) + n^2 p^2.$$

$$\sum_{k=0}^{n} k P_n(k) = \sum_{k=1}^{n} k \frac{n!}{(n-k)!\,k!} p^k q^{n-k} = \sum_{k=1}^{n} \frac{n!}{(n-k)!\,(k-1)!} p^k q^{n-k}$$
$$= \sum_{i=0}^{n-1} \frac{n!}{(n-i-1)!\,i!} p^{i+1} q^{n-i-1} = np \sum_{i=0}^{n-1} \frac{(n-1)!}{(n-1-i)!\,i!} p^i q^{n-1-i}$$
$$= np\,(p+q)^{n-1} = np.$$

$$\sum_{k=0}^{n} k^2 P_n(k) = \sum_{k=1}^{n} k \frac{n!}{(n-k)!\,(k-1)!} p^k q^{n-k}$$
$$= \sum_{k=2}^{n} \frac{n!}{(n-k)!\,(k-2)!} p^k q^{n-k} + \sum_{k=1}^{n} \frac{n!}{(n-k)!\,(k-1)!} p^k q^{n-k} = n^2 p^2 + npq.$$

$$\sum_{k=0}^{n} (k - np)^2 P_n(k) = \sum_{k=0}^{n} k^2 P_n(k) - 2np \sum_{k=0}^{n} k P_n(k) + n^2 p^2$$
$$= n^2 p^2 + npq - 2np \cdot np + n^2 p^2 = npq.$$

$$\sum_{k=0}^{n} (k - np)^2 P_n(k) = \sum_{|k - np| \le n\epsilon} (k - np)^2 P_n(k) + \sum_{|k - np| > n\epsilon} (k - np)^2 P_n(k)$$
$$\ge \sum_{|k - np| > n\epsilon} (k - np)^2 P_n(k) > n^2 \epsilon^2 \sum_{|k - np| > n\epsilon} P_n(k)$$
$$= n^2 \epsilon^2\, P\{ |k - np| > n\epsilon \}.$$

Since the left-hand side equals npq, dividing by $n^2 \epsilon^2$ gives $P\left\{ \left| \frac{k}{n} - p \right| > \epsilon \right\} < \frac{pq}{n\epsilon^2}$.

Example
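• An order of $10^4$ parts is received. The probability that a part is defective equals 0.1.
Estimate the probability that the total number of defective parts is between 900 and 1100.
• Answer: p = 0.1, n = $10^4$, $k_1$ = 900, and $k_2$ = 1100, so the event is
$$\left| \frac{k}{n} - p \right| \le 0.01, \quad \text{set } \epsilon = 0.01.$$
$$P\left\{ \left| \frac{k}{n} - p \right| > \epsilon \right\} < \frac{pq}{n\epsilon^2} = \frac{0.1 \times 0.9}{10^4 \times 0.01^2} = 0.09.$$
Hence the probability that the number of defective parts is between 900 and 1100 is greater than 0.91.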
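
The bound in this example can be compared with the exact binomial probability. The sketch below is an illustration only: it sums the binomial terms for k = 900 to 1100 in log space (using `math.lgamma`) to avoid overflow for n = 10,000; the helper name `log_binomial_pmf` is hypothetical.

```python
from math import exp, lgamma, log

def log_binomial_pmf(n: int, k: int, p: float) -> float:
    """Natural log of C(n, k) p^k (1-p)^(n-k), computed without huge integers."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_comb + k * log(p) + (n - k) * log(1 - p)

n, p, eps = 10_000, 0.1, 0.01

# Bound on the probability of falling OUTSIDE 900..1100.
bound = p * (1 - p) / (n * eps**2)

# Exact probability of falling INSIDE 900..1100.
inside = sum(exp(log_binomial_pmf(n, k, p)) for k in range(900, 1101))

print(f"bound on outside probability: {bound:.2f}")
print(f"guaranteed inside probability: {1 - bound:.2f}")
print(f"exact inside probability: {inside:.4f}")
```

The exact value is well above the guaranteed 0.91, which illustrates how conservative the inequality is.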

Example
• Suppose 5,000 components are ordered. The probability
that a part is defective equals 0.1.
What is the probability that the total number of defective
parts does not exceed 400?
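• Let $Y_k$ = "k parts are defective among 5,000 components".
The desired probability is given by
$$P(Y_0 \cup Y_1 \cup \cdots \cup Y_{400}) = \sum_{k=0}^{400} \binom{5000}{k} (0.1)^k (0.9)^{5000-k}.$$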

Compute?

DeMoivre-Laplace Theorem
• If $npq \gg 1$, i.e. $n \to \infty$ with p held fixed, then for k in the $\sqrt{npq}$ neighborhood of np,
$$\binom{n}{k} p^k q^{n-k} \approx \frac{1}{\sqrt{2\pi npq}}\, e^{-(k-np)^2/(2npq)}$$
• i.e. the probability of k occurrences in n trials approaches a normal (Gaussian) distribution.
[Figure: $P_n(k)$ plotted against k.]
Gaussian Function
$$g(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$
• Its integral $G(x) = \int_{-\infty}^{x} g(y)\,dy = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\,dy$
• $G(\infty) = 1$; $G(-x) = 1 - G(x)$
• For large x, $G(x) \approx 1 - \frac{1}{x} g(x)$
• Define: Error function erf(x)
$$\mathrm{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_{0}^{x} e^{-y^2/2}\,dy = G(x) - \frac{1}{2}$$
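
The functions on this slide can be evaluated with Python's standard library. Note that `math.erf` uses the more common definition (2/√π) ∫₀ˣ e^(−t²) dt, which differs from the erf defined on the slide, so the sketch below converts between the two. The names `g`, `G` and `slide_erf` are illustrative.

```python
from math import erf, exp, pi, sqrt

def g(x: float) -> float:
    """Standard Gaussian density."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def G(x: float) -> float:
    """Integral of g from -infinity to x, via the standard-library erf."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def slide_erf(x: float) -> float:
    """erf as defined on the slide: integral of g from 0 to x."""
    return G(x) - 0.5

x = 5.0
print(G(x), 1 - g(x) / x)       # large-x approximation
print(G(-1.3), 1 - G(1.3))      # symmetry G(-x) = 1 - G(x)
print(slide_erf(1.0))
```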
Example
• A fair coin is tossed 1000 times.
• Find the probability $P_a$ that heads will show 500 times and the probability $P_b$ that heads will show 510 times.
• Answer: p = q = 0.5, n = 1000, so $\sqrt{npq} = 5\sqrt{10}$.
a) If k = 500, k − np = 0, then
$$P_a \approx \frac{1}{\sqrt{2\pi npq}} = \frac{1}{10\sqrt{5\pi}} \approx 0.0252$$
b) If k = 510, k − np = 10, then
$$P_b \approx \frac{e^{-0.2}}{10\sqrt{5\pi}} \approx 0.0207$$
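
The two answers can be reproduced numerically and compared with the exact binomial values. This is a minimal sketch; the function names `gaussian_approx` and `exact_pmf` are illustrative and not from the slides.

```python
from math import comb, exp, pi, sqrt

def gaussian_approx(n: int, k: int, p: float) -> float:
    """DeMoivre-Laplace approximation to the binomial probability of k."""
    npq = n * p * (1 - p)
    return exp(-((k - n * p) ** 2) / (2 * npq)) / sqrt(2 * pi * npq)

def exact_pmf(n: int, k: int, p: float) -> float:
    """Exact binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 1000, 0.5
for k in (500, 510):
    approx = gaussian_approx(n, k, p)
    exact = exact_pmf(n, k, p)
    print(f"k={k}: approximation = {approx:.4f}, exact = {exact:.4f}")
```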
Poisson Theorem
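• If $n \to \infty$, $p \to 0$, but $np \to a$, then
$$\frac{n!}{k!\,(n-k)!}\, p^k q^{n-k} \to e^{-a}\, \frac{a^k}{k!} \quad \text{(Poisson function)}$$
• Approximate the probability of between $k_1$ and $k_2$ occurrences by
$$P(k_1 \le k \le k_2) \approx e^{-np} \sum_{k=k_1}^{k_2} \frac{(np)^k}{k!}$$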

Poisson Function
[Figure: Poisson probabilities P[X = x] for a Poisson function with λ = 4, plotted for x from 0 to 16.]

Example

• An order of 3000 parts is received. The probability that a part is defective equals $10^{-3}$.
• We wish to find the probability P{K>5} that there will be more than five defective parts.
• $P\{K > 5\} = 1 - P\{K \le 5\}$, and with $np = 3$,
$$P(k \le 5) \approx e^{-3} \sum_{k=0}^{5} \frac{3^k}{k!} = 0.916$$
• P{K>5} = 0.084
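
The sum in this example is short enough to evaluate directly. The sketch below assumes the Poisson approximation with parameter np = 3; the function name `poisson_cdf` is illustrative.

```python
from math import exp, factorial

def poisson_cdf(k_max: int, lam: float) -> float:
    """P(K <= k_max) for a Poisson variable with parameter lam."""
    return exp(-lam) * sum(lam**k / factorial(k) for k in range(k_max + 1))

n, p = 3000, 1e-3
lam = n * p  # Poisson parameter np

at_most_5 = poisson_cdf(5, lam)
more_than_5 = 1 - at_most_5
print(f"P(K <= 5) = {at_most_5:.3f}, P(K > 5) = {more_than_5:.3f}")
```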

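Random Poisson Points

$$P(k \text{ in } t_a) = \binom{n}{k} p^k q^{n-k}, \quad \text{where } p = \frac{t_a}{T}$$

$$P(k) \approx e^{-n t_a / T}\, \frac{(n t_a / T)^k}{k!}$$

[Figure: n points placed in the interval (0, T) on the t axis, with a subinterval of length $t_a$.]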
Random Poisson Points
Suppose that n and T increase indefinitely but the ratio $\lambda = n/T$ remains constant.
For an infinite set of points covering the entire t axis from $-\infty$ to $+\infty$, the probability that k of these points are in an interval of length $t_a$ is:
$$P(k) = e^{-\lambda t_a}\, \frac{(\lambda t_a)^k}{k!}$$

Points in Nonoverlapping Intervals
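• Consider two nonoverlapping subintervals $t_a$ and $t_b$.
• $k_a$ of the n points are in the interval $t_a$ and $k_b$ are in the interval $t_b$. The probability $P\{k_a \text{ in } t_a,\ k_b \text{ in } t_b\}$ is
$$P(k_a \text{ in } t_a,\ k_b \text{ in } t_b) = \frac{n!}{k_a!\,k_b!\,k_3!} \left( \frac{t_a}{T} \right)^{k_a} \left( \frac{t_b}{T} \right)^{k_b} \left( 1 - \frac{t_a}{T} - \frac{t_b}{T} \right)^{k_3}$$
• where $k_3 = n - k_a - k_b$
[Figure: the nonoverlapping intervals $t_a$ and $t_b$ on the t axis.]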
Points in Nonoverlapping Intervals
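• When n and T approach infinity with $\frac{n}{T} \to \lambda$,
$$P(k_a \text{ in } t_a,\ k_b \text{ in } t_b) \to e^{-\lambda t_a}\, \frac{(\lambda t_a)^{k_a}}{k_a!} \cdot e^{-\lambda t_b}\, \frac{(\lambda t_b)^{k_b}}{k_b!}$$
which means:
$$P(k_a \text{ in } t_a,\ k_b \text{ in } t_b) = P(k_a \text{ in } t_a)\, P(k_b \text{ in } t_b)$$
i.e. the two events are independent.
[Figure: the nonoverlapping intervals $t_a$ and $t_b$ on the t axis.]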
Density of Poisson Points
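• The probability P(1 occurrence in $\Delta t$) is given by:
$$P(1 \text{ occurrence}) = e^{-\lambda \Delta t}\, \frac{(\lambda \Delta t)}{1!}$$
• When $\Delta t \to 0$:
$$P(1 \text{ occurrence}) \approx \lambda \Delta t$$
• Density or rate of arrival:
$$\lambda = \lim_{\Delta t \to 0} \frac{P(1 \text{ occurrence})}{\Delta t}$$
i.e. λ is the rate of arrival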
Example

• A given stream holds an average of 3 trout per 100 meters. What is the probability of seeing 5 trout in the next 100 meters, assuming a Poisson probability model?
• How about in the next 50 meters, assuming a Poisson probability model?

Example (cont.)

• The probability of seeing 5 trout in the next 100 meters (λ = 3): $e^{-3}\, 3^5 / 5! \approx 0.1008$

• The probability of seeing 5 trout in the next 50 meters (λ = 1.5): $e^{-1.5}\, 1.5^5 / 5! \approx 0.0141$
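
Under the Poisson model the parameter scales with the length of stream observed: 3 trout per 100 meters gives λ = 3 for 100 meters and λ = 1.5 for 50 meters. The following sketch evaluates both cases; the names `poisson_pmf` and `rate_per_meter` are illustrative.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(K = k) for a Poisson variable with parameter lam."""
    return exp(-lam) * lam**k / factorial(k)

rate_per_meter = 3 / 100  # average of 3 trout per 100 meters

p_100 = poisson_pmf(5, rate_per_meter * 100)  # lam = 3
p_50 = poisson_pmf(5, rate_per_meter * 50)    # lam = 1.5
print(f"5 trout in 100 m: {p_100:.4f}")
print(f"5 trout in 50 m:  {p_50:.4f}")
```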

Examples of Poisson Probability
• Many common settings isolated in space or time:

– Phone calls that arrive at a switch per second.

– Customers that arrive at a service point per minute

– Number of bombs dropped per square kilometer during a war

– Number of accidents per hour at a given location

– Number of buy orders per minute for a certain stock

– Number of individuals who have a disease in a large population

– Number of plants of a given species per square kilometer

Possible Applications
• Construction incidents
• Traffic accidents/period
• Bombs in a war zone
• Emails at switch point
• Diseases in a population
• Assembly line defects
• Work absences
• Work related injuries
• People on a line at a bank
• Sicknesses of students in a large school
• Major derogatory reports in credit history
• Number of patients coming to a hospital
• Number of thunderstorms in a summer in a southern city
• Number of buy orders for a given stock per minute
• Number of visitors to recreation sites per year

If:
• Success probability, p, gets very small (p → 0)
• Number of trials, n, becomes very large (n → ∞)
• np stays fairly constant
→ λ = np
V2 Rocket Hits
• 169 areas of 0.25 km² each in South London, arranged in a grid (13 by 13 blocks)
• 215 rockets were fired randomly into the grid = n
• P(a rocket hits a particular grid area) = 1/169 = 0.005917 = p
• Expected number of rocket hits in a particular area = 215 * 1/169 = 1.272
• How many rockets will hit any particular area? 0, 1, 2, … could be anything up to 215.
• The 1.272 is the λ for the Poisson function:
$$P(\#\text{hits}) = \frac{\exp(-\lambda)\, \lambda^{\#\text{hits}}}{\#\text{hits}!}, \quad \#\text{hits} = 0, 1, 2, \ldots$$
[Figure: the 13 by 13 grid of South London areas, with rows and columns numbered 1 to 13, shown over three slides.]

• p = 1/169
• N = 215
• λ = 215 * 1/169 = 1.272
• Theoretical Probabilities: P(X = x) = exp(-λ) λ^x / x!
– P(X=0) = .280
– P(X=1) = .356
– P(X=2) = .227
– P(X=3) = .096
– P(X=4) = .031
– P(X=5) = .008
– P(X=6) = .002

Does the Theory Work?
Outcome | Theoretical probability | Theoretical number of cells | Sample proportion | Sample number of cells
0 | .280 | 47 | .337 | 57
1 | .356 | 60 | .290 | 49
2 | .227 | 38 | .195 | 33
3 | .096 | 16 | .136 | 23
4 | .031 | 5 | .030 | 5
5 | .008 | 1 | .006 | 1
6 | .002 | 0 | .006 | 1
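
The theoretical columns of this table follow from λ = 215/169 and the 169 cells. The sketch below recomputes them and places them next to the sample counts copied from the table; the variable names are illustrative, and rounding the expected counts to whole cells is an assumption about how the slide obtained its integers.

```python
from math import exp, factorial

cells, rockets = 169, 215
lam = rockets / cells  # expected hits per cell, about 1.272

# Theoretical Poisson probabilities for 0..6 hits in a cell.
probs = [exp(-lam) * lam**x / factorial(x) for x in range(7)]

# Theoretical number of cells with x hits, rounded to whole cells.
expected_cells = [round(cells * pr) for pr in probs]

# Sample counts as listed in the table.
observed_cells = [57, 49, 33, 23, 5, 1, 1]

for x, (pr, exp_n, obs_n) in enumerate(zip(probs, expected_cells, observed_cells)):
    print(f"{x}: P = {pr:.3f}, theory = {exp_n}, sample = {obs_n} ({obs_n / cells:.3f})")
```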

Theory vs. The Data
[Figure: Scatterplot of Theory and Experiment (Y-Data, 0.0 to 0.4) versus Outcome (0 to 6).]

Example: Received Emails

9:00:00 9:00:01 9:00:02 9:00:03 9:00:04

Time line. One second intervals.


Assume that, on average, an email arrives at ****@buaa.edu.cn every 0.1 seconds.
Q: How many would you expect to see arrive in each one-second interval?
10 = 1/0.1.
The actual number in any particular one-second interval could be 0, 1, 2, … any number, potentially huge (spam?).
The probability that a given number of emails arrives in a one-second interval is given by a Poisson function with parameter λ = 10.

Example

• Arrival rate of customers at a bank is 3.2 persons per hour
• What is the probability of 6 customers in a particular hour?
• $e^{-3.2} \times 3.2^6 / 6! = 0.0607890$
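
The arithmetic in this example can be checked with a few lines of code. This is only a sketch of the Poisson formula with λ = 3.2 and k = 6; the variable names are illustrative.

```python
from math import exp, factorial

rate = 3.2   # average customers per hour
k = 6        # number of customers asked about

prob = exp(-rate) * rate**k / factorial(k)
print(f"P(6 customers in one hour) = {prob:.7f}")
```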

Random Variable
• A finite single-valued function $X(\cdot)$ that maps the set of all experimental outcomes $\Omega$ into the set of real numbers R is a r.v. if the set $\{\xi \mid X(\xi) \le x\}$ is an event ($\in F$) for every x in R, and the probabilities of the events $\{X = +\infty\}$ and $\{X = -\infty\}$ are 0.

• X is a r.v. if $X^{-1}(B) \in F$, where B represents semi-infinite intervals of the form $\{-\infty < x \le a\}$ and all other sets that can be constructed from these sets by performing the set operations of union, intersection and negation any number of times.

[Figure: X maps an event A in $\Omega$, through $X(\xi)$, to a set B of points x on the real line R.]
• If X is a r.v., then $\{\xi \mid X(\xi) \le x\} = \{X \le x\}$ is an event for every x.
• $\{a < X \le b\}$ is also an event:
• $\{X \le a\}$ and $\{X \le b\}$ are events,
$\{X \le a\}^c = \{X > a\}$ is an event,
thus $\{X > a\} \cap \{X \le b\} = \{a < X \le b\}$ is an event.
• $\bigcap_{n=1}^{\infty} \left\{ a - \frac{1}{n} < X \le a \right\} = \{X = a\}$ is also an event.
Probability Distribution Function (PDF)
• Denote $P\{\xi \mid X(\xi) \le x\} = F_X(x) \ge 0$.
• $F_X(x)$ is said to be the Probability Distribution Function (PDF) associated with the r.v. X. The subscript X identifies the r.v.
• If g(x) is a PDF, then it is nondecreasing and right-continuous, i.e.
(i) $g(+\infty) = 1$, $g(-\infty) = 0$,
(ii) if $x_1 < x_2$, then $g(x_1) \le g(x_2)$,
(iii) $g(x^+) = g(x)$, for all x.
From the earlier definition of $F_X(x)$, we have
(i) $F_X(+\infty) = P\{\xi \mid X(\xi) \le +\infty\} = P(\Omega) = 1$
and $F_X(-\infty) = P\{\xi \mid X(\xi) \le -\infty\} = P(\emptyset) = 0$.
(ii) If $x_1 < x_2$, then the subset $(-\infty, x_1) \subset (-\infty, x_2)$.
Consequently the event $\{\xi \mid X(\xi) \le x_1\} \subset \{\xi \mid X(\xi) \le x_2\}$,
since $X(\xi) \le x_1$ implies $X(\xi) \le x_2$. As a result
$$F_X(x_1) = P\{X(\xi) \le x_1\} \le P\{X(\xi) \le x_2\} = F_X(x_2),$$
implying that the probability distribution function is nonnegative and monotone nondecreasing.
(iii) Let $x < x_n < x_{n-1} < \cdots < x_2 < x_1$, and consider the event
$$A_k = \{\xi \mid x < X(\xi) \le x_k\},$$
since $\{x < X(\xi) \le x_k\} \cup \{X(\xi) \le x\} = \{X(\xi) \le x_k\}$,
