Good morning
Probability
Axioms of Probability
Theorems on probability
Conditional probability
Bayes’ Theorem
1
4/24/2020
Probability
Random Experiment
An experiment with an uncertain outcome, i.e. more than one possible outcome on every trial.
Sample space
The set of all possible outcomes of an experiment is called the sample space.
Event
Every subset of a sample space is an event.
Probability
A measure of uncertainty defined from the sample space to [0,1].
Probability
Random Experiment
An experiment with an uncertain outcome, i.e. more than one possible outcome on every trial - toss a coin.
Sample space
The set of all possible outcomes of an experiment is called the sample space - S = {H, T}.
Event
Every subset of a sample space is an event-A=event of heads={H}.
Probability
A measure of uncertainty defined from the sample space to [0,1] -
P(A) = 1/2, but how?
Probability Approaches
Classical approach
If 'S' is the sample space, then the probability of occurrence of an event 'E' is defined as:
P(E) = n(E)/n(S) = (number of outcomes favouring 'E') / (number of outcomes in sample space 'S')
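The classical approach can be sketched directly in Python; this is a minimal illustration (the helper name and die example are mine, not from the slides):

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Classical approach: P(E) = n(E) / n(S), assuming equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

# Rolling one fair die: probability of an even number
S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}
p = classical_probability(E, S)  # 1/2
```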
Frequency approach
Subjective approach
Axiomatic approach
A function P defined on the events of S, taking values in R, satisfying the following axioms:
P(A) ≥ 0;
P(S) = 1;
P(A ∪ B) = P(A) + P(B) whenever A and B are mutually exclusive.
Types of events
Simple
Compound
Equally Likely
Mutually exclusive
Independent/Dependent
Exhaustive
Complementary
Probability
A coin is tossed 3 times. What is the probability that at least
one head is obtained?
Find the probability of getting a numbered card when a card
is drawn from the pack of 52 cards.
There are 5 green and 7 red balls. Two balls are selected one by
one without replacement. Find the probability that the first is
green and the second is red.
What is the probability of getting a sum of 7 when two dice
are thrown?
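These exercises can be checked by enumerating the sample spaces; a small Python sketch (variable names are mine):

```python
from fractions import Fraction
from itertools import product

# At least one head in 3 coin tosses
coin_space = list(product("HT", repeat=3))
p_head = Fraction(sum("H" in s for s in coin_space), len(coin_space))  # 7/8

# Sum of 7 when two dice are thrown
dice_space = list(product(range(1, 7), repeat=2))
p_seven = Fraction(sum(a + b == 7 for a, b in dice_space), len(dice_space))  # 1/6

# First green, then red, drawn without replacement from 5 green + 7 red
p_green_red = Fraction(5, 12) * Fraction(7, 11)  # 35/132
```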
Counting-Probability
Selection problems
Probability P[A]= n/m
Single event : Single element selected
Probability-Selection problems
A coin is tossed 3 times. What is the probability that at least one head is obtained?
Probability-Selection problems
There are 5 green and 7 red balls. Two balls are randomly
selected. Find the probability that both are red.
Solution:
Probability P[A] = nCr / mCr
Single event: multiple elements selected
n(red) = 7, m(total) = 12, r(selected) = 2
P(A) = 7C2 / 12C2 = [(7 * 6)/(1 * 2)] / [(12 * 11)/(1 * 2)] = 21/66 = 7/22
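The same computation with `math.comb` (a sketch; the variable name is mine):

```python
from fractions import Fraction
from math import comb

# Both balls red: choose 2 of the 7 red, out of C(12, 2) equally likely pairs
p_both_red = Fraction(comb(7, 2), comb(12, 2))  # 21/66 = 7/22
```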
Addition Theorem
If 'A' and 'B' are any two events, then the probability of
occurrence of at least one of the events 'A' and 'B' is given
by:
P(A or B) = P(A) + P(B) - P(A and B)
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Addition Theorem
Ex.: The probability that a contractor will get a contract is 2/3,
and the probability that he will get another contract is 5/9. If the
probability of getting at least one contract is 4/5, what is the probability
that he will get both contracts?
Solution:
A = Event of getting the first contract; B = Event of getting the other contract.
Here P(A) = 2/3, P(B) = 5/9
P(A or B) = 4/5, P(A and B) = ?
By the addition theorem of probability:
P(A or B) = P(A) + P(B) - P(A and B)
4/5 = 2/3 + 5/9 - P(A and B)
or 4/5 = 11/9 - P(A ∩ B)
or P(A ∩ B) = 11/9 - 4/5 = (55 - 36)/45
P(A ∩ B) = 19/45
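Rearranging the addition theorem and checking the arithmetic exactly (a sketch, names are mine):

```python
from fractions import Fraction

# P(A and B) = P(A) + P(B) - P(A or B)
p_a, p_b, p_a_or_b = Fraction(2, 3), Fraction(5, 9), Fraction(4, 5)
p_a_and_b = p_a + p_b - p_a_or_b  # 19/45
```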
Multiplication Theorem
Let A and B be two independent events. Then
multiplication theorem states that,
P[AB]= P[A]. P[B].
Note: P[AB] can also be represented by P[A and B] or
P[A∩B].
Multiplication Theorem
Example:
Let a problem in statistics be given to two students whose probability of
solving it are 1/5 and 5/7.
What is the probability that both solve the problem.
Solution:
Let A= event that the first person solves the problem.
B= event that the second person solves the problem.
It is given that P[A]=1/ 5; P[B]=5/7.
Since A and B are independent, using multiplication theorem
P[AB]= P[A]. P[B].
= 1/5*5/7= 1/7
Conditional Probability
Probability of dependent events is termed conditional
probability.
Let A and B be two events with P(B) > 0. Then
P[A|B] = P[AB] / P[B]
Conditional Probability
Example:
Let a file contain 10 papers numbered 1 to 10. A paper is selected at random.
What is the probability that it is 10, given that it is at least 5?
Solution:
From the problem we can see that,
Sample space S ={1,2,3,4,5,6,7,8,9,10}
Event that number is 10 : A ={10}.
Event that number is at least 5: B ={5,6,7,8,9,10}.
A and B = {10}.
P[A] = 1/10; P[B] = 6/10; P[AB] = 1/10.
Therefore,
P[A|B] = P[AB]/P[B] = (1/10)/(6/10) = 1/6
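The same conditional probability by direct enumeration (a minimal sketch; names are mine):

```python
from fractions import Fraction

S = set(range(1, 11))        # papers numbered 1..10
A = {10}                     # the number is 10
B = {n for n in S if n >= 5}  # the number is at least 5

p_b = Fraction(len(B), len(S))
p_ab = Fraction(len(A & B), len(S))
p_a_given_b = p_ab / p_b     # 1/6
```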
Complement
Complement probability
P(A^c) = 1 - P(A)
Bayes' Theorem
Statement:
Let E1, E2, …, En be n mutually exclusive and exhaustive events, and B be any event with P(B) > 0.
Then
P(Ei | B) = P(B | Ei) P(Ei) / Σ(j=1..n) P(B | Ej) P(Ej)
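The statement translates directly into code; here is a small sketch with a hypothetical two-urn example (the urns and their probabilities are illustrative, not from the slides):

```python
from fractions import Fraction

def bayes(priors, likelihoods, i):
    """P(E_i | B) = P(B|E_i) P(E_i) / sum_j P(B|E_j) P(E_j)."""
    total = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[i] * likelihoods[i] / total

# Hypothetical: two urns chosen with equal probability;
# urn 0 yields a red ball with prob 1/3, urn 1 with prob 2/3.
priors = [Fraction(1, 2), Fraction(1, 2)]
likelihoods = [Fraction(1, 3), Fraction(2, 3)]
p_urn1_given_red = bayes(priors, likelihoods, 1)  # 2/3
```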
5/15/2020
Good Morning
Recall from previous session:
• Probability
• Conditional Probability
• Theorems - Addition/Multiplication
• Bayes' Theorem
Today's Sessions - Four Things to Discuss/Learn
• Random variables - Types
• Probability Distributions - Measures
• Standard Discrete distributions: Binomial/Poisson/Geometric
Probability
• Single event, single element selected: P[A] = n/m
• Single event, r elements selected: P[A] = nCr / mCr
• Addition theorem (two events, either one required): P[A or B] = P[A] + P[B] - P[A and B]
Probability
A speaks truth in 75% of cases and B in 80% of cases.
In what percentage of cases are they likely to
contradict each other, narrating the same incident?
Solution:
A and B contradict each other when exactly one of them lies:
[A true and B lies] or [A lies and B true]
P(contradict) = P(A)·P(B-lie) + P(A-lie)·P(B)
[Note that we add because the two cases are mutually exclusive]
= (3/4 * 1/5) + (1/4 * 4/5) = 7/20
= (7/20 * 100)% = 35%
Probability
What is the probability of getting 53 Mondays in a leap year?
1/7 or 2/7 or 3/7 or 1
Solution:
An ordinary year has 365 days; a leap year has 366 days.
A year has 52 full weeks, so there will be 52 Mondays for sure.
52 weeks = 52 x 7 = 364 days, which leaves 366 - 364 = 2 days to consider.
These 2 days can be:
1. Sunday, Monday; 2. Monday, Tuesday;
3. Tuesday, Wednesday; 4. Wednesday, Thursday;
5. Thursday, Friday; 6. Friday, Saturday; 7. Saturday, Sunday
Of these 7 equally likely outcomes, 2 contain a Monday.
Hence the probability of getting 53 Mondays = 2/7
Probability - Example
A person sells cone ice creams from a moving vehicle. Any
customer can buy 1 or more cones, as per the following
distribution based on past data (400 customers in total):

Number of ice creams:  1       2       3       4      5
Customers:             175     115     63      32     15
Probability:           0.4375  0.2875  0.1575  0.08   0.0375

How many of his next 100 customers will buy 2 ice creams?
P(X=2) = 115/400 = 0.2875
So, 100 * 0.2875 ≈ 29 customers.
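Turning the frequency table into an empirical distribution is a one-liner in Python (a sketch, names are mine):

```python
# Past data: number of ice creams bought -> number of customers
counts = {1: 175, 2: 115, 3: 63, 4: 32, 5: 15}
total = sum(counts.values())                  # 400 past customers
probs = {k: v / total for k, v in counts.items()}

# Expected buyers of 2 ice creams among the next 100 customers
expected_two = round(100 * probs[2])          # ~29
```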
Good morning
Random Variables
We can answer questions regarding a specified event through
probability, like how frequently that event is expected to
occur…
And we can handle any event discussion using probability.
But do we have any idea about the random experiment as a
whole?
Random Variables
We understand a random experiment as a whole through random
variables.
A random variable is a function that maps the sample space to
R (or R^n).
RANDOM VARIABLE
So, a r.v is a variable, because it takes many values and is random
because the values taken depend on which of the outcomes the
experiment results in.
Random Variable
For example, suppose we toss two six-sided dice and let X be the sum of the faces. The sample space is
S: { (1; 1) (1; 2) (1; 3) (1; 4) (1; 5) (1; 6)
(2; 1) (2; 2) (2; 3) (2; 4) (2; 5) (2; 6)
(3; 1) (3; 2) (3; 3) (3; 4) (3; 5) (3; 6)
(4; 1) (4; 2) (4; 3) (4; 4) (4; 5) (4; 6)
(5; 1) (5; 2) (5; 3) (5; 4) (5; 5) (5; 6)
(6; 1) (6; 2) (6; 3) (6; 4) (6; 5) (6; 6) }
Random Variable
If we know the probabilities of a set of events, we can
calculate the probabilities that a random variable defined on
those set of events takes on certain values. For example
P(X = 2) = P({(1, 1)}) = 1/36
P(X = 5) = P({(1, 4), (2, 3), (3, 2), (4, 1)}) = 4/36 = 1/9.
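The full pmf of the sum of two dice can be tabulated by enumeration (a sketch; names are mine):

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# X = sum of two fair dice; pmf from the 36 equally likely outcomes
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {x: Fraction(c, 36) for x, c in counts.items()}

p2 = pmf[2]  # 1/36
p5 = pmf[5]  # 1/9
```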
Random Variables-Types
Random Variable - Probability Distributions
The probability that X takes a specific value x is p(x). That is,
P(X = x) = p(x) = p_x
p(x) is non-negative for all real x: p(x) ≥ 0 for all x
The sum of p(x) over all possible values of x is 1: Σ_x p(x) = 1
8
5/15/2020
Random Variable - Measures
Expectation: E[X] = Σ_x x p(x) (discrete) or ∫ x f(x) dx (continuous)
Variance: V[X] = Σ_x x^2 p(x) - µ^2 (discrete) or ∫ x^2 f(x) dx - µ^2 (continuous), where µ = E[X]
Random Variable - Example 1
Discrete r.v.: probability mass function: p(x) = P(X = x)
Ex: Toss a coin twice; S = {HH, HT, TH, TT}
X = number of heads ∈ {0, 1, 2}
p(0) = P{TT} = 1/4
p(1) = P{TH, HT} = 2/4
p(2) = P{HH} = 1/4
E[X] = 0*(1/4) + 1*(2/4) + 2*(1/4) = 1
V[X] = E[X^2] - (E[X])^2 = (0 + 2/4 + 4/4) - 1 = 1/2
P(X = 2 or X = 3) = p(2) + 0 = 1/4 (X never equals 3)
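The measures for this pmf can be checked exactly (a sketch; names are mine):

```python
from fractions import Fraction

# X = number of heads in two fair coin tosses
pmf = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())              # E[X] = 1
var = sum(x**2 * p for x, p in pmf.items()) - mean**2  # V[X] = 1/2
p_2_or_3 = pmf.get(2, 0) + pmf.get(3, 0)               # X never equals 3
```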
Random Variable - Example 3
Suppose a variable X can take the values 1, 2, 3, or 4.
The probabilities associated with each outcome are described
by the following table:

Outcome:     1    2    3    4
Probability: 0.1  0.3  0.4  0.2
Random Variables-Example 4
For the following data, find the measures and form the
histograms of the pmf and cdf.
Standard discrete distributions
Bernoulli: p(x) = p if x = 1; q if x = 0
Binomial: p(x) = nCx p^x q^(n-x)
Poisson: p(x) = e^(-λ) λ^x / x!
Geometric: p(x) = p q^(x-1)
Discrete Distributions
A bag contains 50 balls, 20 blue and 30 red. Four balls are taken
one after another with replacement. Probability of getting 2 from
each category? - Binomial
A bag contains 500 balls, 200 blue and 300 red. Four balls are
taken one after another with replacement. Probability of getting 2
from each category? - Poisson
A bag contains 50 balls, 20 blue and 30 red. Balls are taken one
after another with replacement until we get a blue one.
Probability of getting blue in the fourth selection? - Geometric
Poisson Distribution
The mean and variance of a Poisson r.v. are equal, both being λ = np.
Geometric Distribution
Possible values are Y = 1, 2, 3, …
Mean = 1/p
Variance = (1 - p)/p^2
Geometric Distribution
A coin has been weighted so that it has a 0.9 chance of
landing on heads when flipped. What is the probability that
the first time the coin lands on heads is the 3rd flip?
Solution:
X = number of Bernoulli trials (flips) to get the first head; p = 0.9
P(X = 3) = P(T, T, H) = (0.1)^2 * (0.9) = 0.009
Geometric Distribution
A coin has been weighted so that it has a 0.9 chance of
landing on heads when flipped. What is the probability that
the first time the coin lands on heads is after the 3rd flip?
Solution:
X = number of Bernoulli trials (flips) to get the first head; p = 0.9
P(X > 3) = 1 - P(X <= 3) = 1 - [P(X=1) + P(X=2) + P(X=3)]
= 1 - [(0.1)^0 * (0.9) + (0.1)^1 * (0.9) + (0.1)^2 * (0.9)]
= 1 - [0.9 + 0.09 + 0.009]
= 0.001
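Both geometric calculations can be verified with a tiny helper (a sketch; the function name is mine):

```python
def geometric_pmf(x, p):
    """P(X = x): first success on trial x, with success probability p."""
    return (1 - p) ** (x - 1) * p

p = 0.9
p_third = geometric_pmf(3, p)                                   # 0.1^2 * 0.9 = 0.009
p_after_third = 1 - sum(geometric_pmf(x, p) for x in (1, 2, 3))  # 0.001
```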
Previous session
• Random Variables - Types
• Probability distributions - Measures
• Binomial, Poisson, Geometric
This session
• Binomial, Poisson, Geometric
• Hyper-geometric
• Poisson Process
• Chebyshev's theorem
• Simulation
Standard distributions
Binomial:
Fixed Number(n) of Independent Bernoulli trials.
Probability of success(p).
Probability of Failure(q).
Required number of successes(x).
p(x) = nCx p^x q^(n-x)
Standard distributions
Poisson:
Fixed Number(n) of Independent Bernoulli trials.
Probability of success(p).
Required number of successes(x).
n large, p small and np=λ, constant.
p(x) = e^(-λ) λ^x / x!
Standard distributions
Geometric:
Repeated Independent Bernoulli trials before first success.
Probability of success(p).
Probability of Failure(q).
Required number of BT(x).
p(x) = p q^(x-1)
Standard distributions
Hyper Geometric:
Fixed number (n) of Dependent trials/Number of draws .
The number of successes(A) in the population.
The number of observed successes(a).
The number of elements(N).
P(X = a) = [ C(A, a) * C(N-A, n-a) ] / C(N, n)
Hyper-Geometric Distribution
This is similar to Binomial distribution except that here the
trials are without replacements.
Recall that Binomial is like probability of taking 5 blue balls
from a bag with 10 blue and 15 white balls. But it was
mentioned that the trials were independent meaning the
drawing was with replacement. If we don’t replace the balls
selected, then the trials are dependent and now the probability
of choosing 5 blue balls is from Hyper-geometric.
                        With replacements      Without replacements
Given number of draws:  Binomial distribution  Hyper-geometric distribution
Hyper-Geometric Distribution
Its pmf: P(X = a) = [ C(A, a) * C(N-A, n-a) ] / C(N, n)
Where:
A is the number of successes in the population
a is the number of observed successes
N is the population size
n is the number of draws
Hyper-Geometric Distribution-Example
A deck of cards contains 20 cards: 6 red cards and 14 black
cards. 5 cards are drawn randomly without replacement.
What is the probability that exactly 4 red cards are drawn?
P(X = 4) = [ C(6, 4) * C(14, 1) ] / C(20, 5)
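The hyper-geometric pmf maps directly onto `math.comb`; a minimal sketch (the helper name is mine):

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(a, A, N, n):
    """P(X = a): a successes in n draws without replacement,
    from a population of N containing A successes."""
    return Fraction(comb(A, a) * comb(N - A, n - a), comb(N, n))

# Exactly 4 red cards when drawing 5 from 6 red + 14 black
p4 = hypergeom_pmf(4, A=6, N=20, n=5)
```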
Hyper-Geometric Distribution-Example
A small voting district has 101 female voters and 95 male
voters. A random sample of 10 voters is drawn. What is
the probability exactly 7 of the voters will be female?
P(X = 7) = [ C(101, 7) * C(95, 3) ] / C(196, 10)
Hyper-geometric
If a hyper-geometric distribution is represented by
X ~ H(P, Q, n), where N = P + Q,
its mean is nP / (P + Q).
Poisson Process
A Poisson process is a counting process with independent increments and stationary increments.
A stochastic process {N(t), t ≥ 0} is a counting process if N(t)
represents the total number of events that have occurred in [0, t].
Then {N(t), t ≥ 0} must satisfy:
N(t) ≥ 0; N(t) is an integer for all t
If s < t, then N(s) ≤ N(t)
For s < t, N(t) - N(s) is the number of events that occur in the
interval (s, t].
Poisson Process
A counting process has independent increments if, for any
0 ≤ s ≤ t ≤ u ≤ v, N(t) - N(s) is independent of N(v) - N(u).
Poisson Process
A counting process {N(t), t ≥ 0} is a Poisson process with rate λ > 0 if:
N(0) = 0
The process has independent increments
The number of events in any interval of length t follows a
Poisson distribution with mean λt (therefore, it has stationary
increments), i.e.,
P{N(t + s) - N(s) = n} = e^(-λt) (λt)^n / n!,  n = 0, 1, ...
Chebyshev's theorem
Chebyshev's theorem shows how to use the mean and the
standard deviation to find the percentage of the total observations that
fall within a given interval about the mean.
The fraction of any set of numbers lying within k standard deviations
of the mean of those numbers is at least 1 - 1/k^2, where
k = (the within number) / (the standard deviation)
and k must be greater than 1.
Chebychev’s theorem-Example
Use Chebyshev's theorem to find what percent of the values will fall between 123 and 179 for
a data set with mean of 151 and standard deviation of 14.
Solution:
We subtract 151-123 and get 28, which tells us that 123 is 28 units below the mean.
We subtract 179-151 and also get 28, which tells us that 179 is 28 units above the mean.
Those two together tell us that the values between 123 and 179 are all within 28 units of the
mean. Therefore the "within number" is 28.
So we find the number of standard deviations, k, which the "within
number", 28, amounts to by dividing it by the standard deviation:
k=the within number/the standard deviation=28/14=2
So now we know that the values between 123 and 179 are all within 28 units of the mean,
which is the same as within k=2 standard deviations of the mean. Now, since k > 1 we can
use Chebyshev's formula to find the fraction of the data that are within k=2 standard
deviations of the mean. Substituting k=2 we have:
1−[1/k2]=1−[1/22]=1−1/4=3/4
So 3/4 of the data(75%) lie between 123 and 179.
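The worked example can be packaged as a small function (a sketch; the function name is mine, and it assumes an interval symmetric about the mean as in Chebyshev's statement):

```python
def chebyshev_bound(lo, hi, mean, sd):
    """At least 1 - 1/k^2 of the data lies within k standard deviations
    of the mean; the interval must be symmetric about the mean, k > 1."""
    assert abs((mean - lo) - (hi - mean)) < 1e-9, "interval not symmetric"
    k = (hi - mean) / sd
    assert k > 1, "Chebyshev's bound needs k > 1"
    return 1 - 1 / k**2

frac = chebyshev_bound(123, 179, mean=151, sd=14)  # k = 2 -> 0.75
```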
Discrete Simulation
Simulation is basically about mimicking a system to understand it.
We observe the system through a set of variables called
state variables; when the state variables change their values
only at a countable number of points in time, it is called Discrete
Simulation.
Most business processes can be described as a sequence of
separate, discrete, events. For example, a truck arrives at a
warehouse, goes to an unloading gate, unloads, and then
departs. To simulate this, discrete event modeling is often
chosen.
Discrete Simulation
Steps in a simulation study:
Discrete Simulation
In discrete systems, the changes in the system state are discontinuous, and
each change in the state of the system is called an event. The model used in
a discrete-system simulation has a set of numbers to represent the state of
the system, called a state descriptor. Queuing simulation is a very important
aspect of discrete-event simulation, along with simulation of time-sharing
systems.
Discrete Simulation
Key Aspects:
Entities − These are the representation of real elements like the parts of
machines.
Relationships − It means to link entities together.
Simulation Executive − It is responsible for controlling the advance time
and executing discrete events.
Random Number Generator − It helps to simulate different data
coming into the simulation model.
Results & Statistics − It validates the model and provides its performance
measures.
5/29/2020
Previous session
• Binomial, Poisson, Geometric
• Hyper-geometric
• Poisson Process
• Chebyshev's theorem
• Simulation
Today's topics
• Continuous r.v.'s - Measures
• Normal Distribution
• Standardization
• Problems
For a continuous r.v. X with pdf f(x):
(i) ∫ f(x) dx = 1
(ii) f(x) ≥ 0
E(X) = ∫[a,b] x f(x) dx
E(X^n) = ∫[a,b] x^n f(x) dx
For f(x) = c x^2 on (-1, 1): ∫[-1,1] c x^2 dx = 2c/3 = 1, giving c = 3/2.
(b) E(X) = ∫[-1,1] x f(x) dx = ∫[-1,1] (3/2) x^3 dx = [3x^4/8] from -1 to 1 = 0
E(X^2) = ∫[-1,1] x^2 f(x) dx = ∫[-1,1] (3/2) x^4 dx = [3x^5/10] from -1 to 1 = 3/5
Var(X) = E(X^2) - [E(X)]^2 = 3/5 - 0 = 3/5
(c) P(X ≥ 1/2) = ∫[1/2,1] (3/2) x^2 dx = [x^3/2] from 1/2 to 1 = 1/2 - 1/16 = 7/16
Normal Distribution
A normal random variable X with mean µ (-∞ < µ < ∞) and variance σ^2 > 0 has pdf
f(x) = [1 / (σ √(2π))] exp( -(x - µ)^2 / (2σ^2) ),  -∞ < x < ∞
Normal Distribution
A normal variable X with equal mean and differing
variance,
Normal Distribution
Differing parameters
[Normal table: value for z = 1.45]
For X ~ N(50, 10^2), P(45 < X < 62):
Standardize: z1 = (45 - 50)/10 = -0.5
z2 = (62 - 50)/10 = 1.2
Prob = area from Table(1.2) - area from Table(-0.5)
= 0.8849 - (1 - 0.6915)
= 0.8849 - 0.3085 = 0.5764
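The standardization can be checked with the standard library instead of a printed table; a sketch assuming X ~ N(50, 10^2) as in the slide's arithmetic:

```python
from statistics import NormalDist

# P(45 < X < 62) for X ~ N(mean=50, sd=10), via standardization
z1 = (45 - 50) / 10   # -0.5
z2 = (62 - 50) / 10   #  1.2
std = NormalDist()    # standard normal
prob = std.cdf(z2) - std.cdf(z1)  # ~0.5764
```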
6/5/2020
Good morning
Previous session
• Continuous r.v.'s - Measures
• Normal Distribution
• Standardization
• Problems
Today's Session
• Normal approximation to Binomial - Continuity
• Uniform Distribution
• Exponential Distribution
• Gamma Distribution
• Beta Distribution
Continuous distribution
1. Suppose in a quiz there are 30 participants. A question is
given to all 30 participants, and the time allowed to answer it
is 25 seconds. Find the probability that a participant responds
within 6 seconds.
2. Suppose a flight is about to land and the
announcement says that the expected time to land is 30
minutes. Find the probability that the flight lands between
25 and 30 minutes.
3. Suppose a train is delayed by at most 60 minutes.
What is the probability that the train arrives with a delay of between
57 and 60 minutes?
Continuous distribution
4. The data that follows are 55 smiling times, in seconds, of
an eight-week old baby.
Uniform Distribution
Let us consider the baby smiling example. From the data we
can calculate the mean smiling time to be 11.49 seconds, with a
standard deviation of 6.23 seconds.
Since this is an entirely spontaneous activity which could be
termed completely random, we can use Uniform distribution
to approximate this.
Uniform Distribution
A random variable is uniformly distributed in the
interval(a,b) if its pdf is defined by
f(x) = 1/(b - a),  a ≤ x ≤ b
     = 0,          otherwise
Uniform Distribution
So, we need to form the uniform distribution.
From the table, the smallest value is 0.7 and the largest is
22.8. So, if we assume that smiling times, in seconds, is
uniformly distributed between(0,23), then by the definition
of the uniform distribution
f(x) = 1/(b - a) for a ≤ x ≤ b, and 0 otherwise.
Here a = 0 and b = 23, so f(x) = 1/23 on (0, 23).
Uniform Distribution
So, P(2<X<18)=
P(2 < X < 18) = ∫[2,18] f(x) dx = ∫[2,18] (1/23) dx = (18 - 2)/23 = 16/23
Uniform Distribution
Can we solve this in any other logical way
Uniform Distribution
Can we solve this in any other logical way?
Yes. Consider the following rectangle.
P(2 < X < 18) = ∫[2,18] f(x) dx = (base) * (height) = (18 - 2) * (1/23) = 16/23
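The rectangle argument generalizes to any sub-interval of a uniform distribution; a minimal sketch (the helper name is mine):

```python
from fractions import Fraction

def uniform_prob(lo, hi, a, b):
    """P(lo < X < hi) for X ~ U(a, b): (base) * (height), clipped to (a, b)."""
    lo, hi = max(lo, a), min(hi, b)
    return Fraction(hi - lo, b - a)

p = uniform_prob(2, 18, a=0, b=23)  # 16/23
```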
Uniform
The amount of time, in minutes, that a person must wait for
a bus is uniformly distributed between 0 and 15 minutes,
inclusive.
X ~ U(0, 15)
Exponential
Suppose the number of hits to your website follow a Poisson
distribution at a rate of 2 per day. Let T be the time (in days)
between hits.
Suppose that messages arrive to a computer server following
a Poisson distribution at the rate of 6 per hour. Let T be the
time in hours that elapses between messages.
Exponential
A random variable X follows exponential with rate
parameter λ if its pdf is defined by
f(x) = λ e^(-λx) for x ≥ 0, and 0 otherwise.
E[X] = 1/λ;  V[X] = 1/λ^2
Exponential
If jobs arrive every 15 seconds on average (λ = 4
per minute), what is the probability of waiting less than or
equal to 30 seconds, i.e. 0.5 min?
Exponential
P(X ≤ 0.5) = ∫[0,0.5] 4 e^(-4x) dx = 1 - e^(-2) ≈ 0.86
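The exponential cdf makes this a one-line check (a sketch; the function name is mine):

```python
from math import exp

def exponential_cdf(x, lam):
    """P(X <= x) for an exponential r.v. with rate lam."""
    return 1 - exp(-lam * x)

# Jobs arrive at rate 4 per minute; waiting time <= 0.5 min
p = exponential_cdf(0.5, lam=4)  # 1 - e^-2 ~ 0.86
```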
Example
Accidents occur with a Poisson distribution at an average of 4
per week. i.e.λ=4
1. Calculate the probability of more than 5 accidents in any
one week
2. What is the probability that at least two weeks will elapse
between accidents?
Solution
(i) X = number of accidents in a week; X is Poisson with mean 4.
P(X > 5) = 1 - P(X <= 5)
= 1 - {P(X=0) + P(X=1) + … + P(X=5)}
where P(X=0) = e^(-4) 4^0 / 0! = e^(-4), and so on.
(ii) T = time between occurrences; T is exponential with rate 4 per week (mean 1/4).
P(T ≥ 2) = ∫[2,∞) 4 e^(-4t) dt = e^(-8) ≈ 0.00034
Exponential
Memoryless property.
This distribution has a memoryless property, which means it
“forgets” what has come before it.
In other words, if you continue to wait, the length of time
you wait neither increases nor decreases the probability of an
event happening.
Let’s say a hurricane hits your island. The probability of
another hurricane hitting in one week, one month, or ten
years from that point are all equal. The exponential is
the only distribution with the memoryless property.
P(X > s + t | X > t) = P(X > s) for all s, t ≥ 0
Exponential
Applications
The exponential often models waiting times, time
between events, and lifetimes of objects…
“How much time will go by before a major hurricane hits the
Atlantic Seaboard?” or
“How long will the transmission in my car last before it
breaks?”.
Exponential
Applications
Young and Young (198) give an example of how the
exponential distribution can also model space between
events (instead of time between events). Say that grass seeds
are randomly dispersed over a field; Each point in the field
has an equal chance of a seed landing there. Now select a
point P, then measure the distance to the nearest seed. This
distance becomes the radius for a circle with point P at the
center. The circle’s area has an exponential distribution.
Exponential Family
The exponential distribution is one member of a very large class
of probability distributions called the exponential families, or
exponential classes. Some of the more well-known members of this
family include:
The Bernoulli distribution; the beta distribution;
the binomial distribution (for a fixed number of trials);
the categorical distribution;
the chi-squared distribution; the Dirichlet distribution;
the gamma distribution;
the geometric distribution; the inverse Gaussian distribution;
the lognormal distribution;
the negative binomial distribution (for a fixed number of failures);
the normal distribution; the Poisson distribution;
the von Mises distribution; the von Mises-Fisher distribution.
If X1, …, Xk are i.i.d. Exponential(λ) random variables,
then X1 + ··· + Xk is a Gamma G(k, λ) random variable.
Gamma Distribution
A random variable is Gamma distributed with shape parameter r and rate parameter λ if its pdf is given by
f(x) = λ^r e^(-λx) x^(r-1) / Γ(r),  x > 0
Gamma function: Γ(x) = ∫[0,∞) t^(x-1) e^(-t) dt
Note: Γ(n) = (n-1)! for positive integers n.
Gamma Distribution
Gamma(shape,scale)
Beta Distribution
A random variable is Beta distributed with parameters α, β
(both shape parameters) if its pdf is given by
f(x) = x^(α-1) (1-x)^(β-1) / B(α, β),  0 < x < 1
Gamma-Beta
Examples
Suppose that on an average 1 customer per minute arrive at a
shop. What is the probability that the shopkeeper will wait
more than 5 minutes before
(i) both of the first two customers arrive, and
(ii) the first customer arrive?
Solution: Let X denote the waiting time in minutes until the second
customer arrives; then X has a gamma distribution with r = 2
(the waiting time is up to the 2nd customer) and λ = 1.
(i) P(X > 5) = (1 + 5) e^(-5) = 6 e^(-5) ≈ 0.040.
(ii) The waiting time for the first customer is exponential with λ = 1,
so P(X > 5) = e^(-5) ≈ 0.0067.
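For integer shape r, the gamma survival probability equals the probability of fewer than r Poisson events, which makes it easy to verify (a sketch; the function name is mine):

```python
from math import exp, factorial

def gamma_sf_integer_shape(x, r, lam):
    """P(X > x) for a Gamma(r, lam) r.v. with integer shape r:
    equals P(fewer than r events of a Poisson with mean lam*x)."""
    m = lam * x
    return sum(exp(-m) * m**k / factorial(k) for k in range(r))

p_both = gamma_sf_integer_shape(5, r=2, lam=1)   # 6 e^-5 ~ 0.040
p_first = gamma_sf_integer_shape(5, r=1, lam=1)  # e^-5 ~ 0.0067
```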
Beta Distribution
Properties
The difference between the binomial and the beta is that the
former models the number of successes (x), while
the latter models the probability (p) of success.
Interpretation of α, β
You can think of α-1 as the number of successes and β-1
as the number of failures, just like the x and n-x terms in the
binomial.
Problems
Uniform, Exponential, Gamma
The average amount of weight gained in winter is uniformly
distributed, U(0, 30). What is P(10 < X < 15)?
In a quiz , there are 30 participants. Time allowed to answer
is 25 seconds. How many will answer within 6 secs.
Suppose sending a money order is a random event at a
particular post office and on an average a money order sent
every 15 minutes. What is the probability that a total of 10
money orders are sent in <3 hours?
Summary-five sessions
Probability.
Approaches.
Theorems –Addition, Multiplication.
Conditional Probability-Bayes Theorem.
Random Variables-Distributions-Measures.
Discrete RV - Binomial, Poisson, Geometric, Hyper-Geometric.
Continuous RV - Normal, Exponential, Uniform, Gamma, Beta.
Chebyshev's theorem.
Poisson process.
Simulation.
6/19/2020
Recap
• Probability
• Discrete r.v.
• Continuous r.v.
• Others: Poisson Process, Chebyshev's theorem, Simulation
This session
• Joint Distributions
• Joint discrete and joint continuous
• Properties
• Mean and variance of the sample mean
Random Vector
Let us play a game: you toss a coin, I'll toss a die.
Each (coin, die) outcome is equally likely:

         Die: 1     2     3     4     5     6
Coin H:  1/12  1/12  1/12  1/12  1/12  1/12
Coin T:  1/12  1/12  1/12  1/12  1/12  1/12
Joint p.m.f. (discrete): P(X = x_i, Y = y_j) = p_ij ≥ 0, satisfying Σ_i Σ_j p_ij = 1
Joint c.d.f.:
Discrete: F(x, y) = Σ_{i: x_i ≤ x} Σ_{j: y_j ≤ y} p_ij
Continuous: F(x, y) = ∫_{w ≤ x} ∫_{z ≤ y} f(w, z) dz dw
Example-Joint Distribution
Air Conditioner Maintenance
A company that services air conditioner units in residences and
office blocks is interested in how to schedule its technicians in
the most efficient manner
The random variable X, taking the values 1,2,3 and 4, is the
service time in hours
The random variable Y, taking the values 1,2 and 3, is the
number of air conditioner units
Example-Joint Distribution
[Joint p.m.f. table of X = service time (1, 2, 3, 4) and Y = number of units (1, 2, 3), with entries p_ij ≥ 0 (e.g. 0.12, 0.18, 0.07) satisfying Σ_i Σ_j p_ij = 1.00]
Bivariate
BIVARIATE DISCRETE: p(x, y) = p_{X|Y}(x|y) p_Y(y), with marginals p_X(x) and p_Y(y)
Conditional density
Let X and Y denote two random variables with
joint probability density function f(x,y) and
marginal densities fX(x), fY(y) then
the conditional density of Y given X = x is
f_{Y|X}(y|x) = f(x, y) / f_X(x)
and the conditional density of X given Y = y is
f_{X|Y}(x|y) = f(x, y) / f_Y(y)
Bivariate Normal
f(x1, x2) = [1 / (2π σ1 σ2 √(1 - ρ^2))] exp( -Q(x1, x2)/2 ), where
Q(x1, x2) = [1/(1 - ρ^2)] { [(x1 - µ1)/σ1]^2 - 2ρ [(x1 - µ1)/σ1][(x2 - µ2)/σ2] + [(x2 - µ2)/σ2]^2 }
Bivariate-Continuous-Example
If X and Y are jointly distributed as given by
Bivariate-Continuous-Example
Bivariate-Continuous-Example
Recall that
Bivariate-Continuous-Example
Similarly, the pdf of Y is
Covariance
The covariance of two random variables is a statistic that
tells you how "correlated" two random variables are. If two
random variables are independent, then their covariance is
zero. If their covariance is nonzero, then the value gives you
an indication of "how dependent they are".
For example, how do the height and weight of a person co-vary?
Will an increase in height result in an increase of weight? If so,
by how much?
Covariance
Covariance is a measure of how much two random variables
vary together. It’s similar to variance, but where variance
tells you how a single variable varies, co variance tells you
how two variables vary together.
Covariance
The covariance of two random variables X and Y is
σ_XY = E[ (X - µ_X)(Y - µ_Y) ]
which for application purposes can be simplified to
σ_XY = E[XY] - µ_X µ_Y
Covariance
Two ball pens are selected at random from a bag containing 3
blue, 2 red and 3 green pens. If X is the number of blue pens
selected and Y is the number of red pens selected, find
(i) the joint distribution of X and Y
(ii) P[(X, Y) ∈ A], where A is the region {(x, y): x + y ≤ 1}
(iii) the covariance of X and Y
Covariance
The possible pairs of values of (X, Y) are
(0,0), (0,1), (0,2), (1,0), (1,1), (2,0).
Their joint pmf is f(x, y) = 3Cx * 2Cy * 3C(2-x-y) / 8C2,
giving the table:

f(x,y)            X=0    X=1    X=2    Row total P(y)
Y=0               3/28   9/28   3/28   15/28
Y=1               6/28   6/28   0      12/28
Y=2               1/28   0      0      1/28
Column total P(x) 10/28  15/28  3/28   1
total
Covariance
E[XY] = Σ_x Σ_y x y f(x, y) = (0*0)(3/28) + (1*0)(9/28) + … + (1*1)(6/28) = 6/28 = 3/14
Covariance
Using the marginal totals from the table above:
µ_X = Σ_x x f_X(x) = 0*(10/28) + 1*(15/28) + 2*(3/28) = 3/4
µ_Y = Σ_y y f_Y(y) = 0*(15/28) + 1*(12/28) + 2*(1/28) = 1/2
σ_XY = E[XY] - µ_X µ_Y = 3/14 - (3/4)(1/2) = 12/56 - 21/56 = -9/56
Covariance
Example
If two random variables X and Y have the joint pdf f(x, y) = 3x,
0 <= y <= x <= 1,
Find the covariance between X and Y.
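One way to sanity-check the answer is a brute-force numerical integration over the triangular support; the analytic integrals work out to Cov(X, Y) = 3/160 = 0.01875. The sketch below uses a midpoint Riemann sum (the grid resolution is an arbitrary choice):

```python
# Brute-force check of Cov(X, Y) for f(x, y) = 3x on 0 <= y <= x <= 1,
# using a midpoint Riemann sum over an N x N grid.
N = 400  # grid resolution (arbitrary choice)
h = 1.0 / N

mass = e_x = e_y = e_xy = 0.0
for i in range(N):
    x = (i + 0.5) * h
    for j in range(N):
        y = (j + 0.5) * h
        if y <= x:                # the triangular support
            w = 3.0 * x * h * h   # f(x, y) dA
            mass += w
            e_x += x * w
            e_y += y * w
            e_xy += x * y * w

# Normalise by the summed mass to cancel the boundary staircase error.
e_x, e_y, e_xy = e_x / mass, e_y / mass, e_xy / mass
cov = e_xy - e_x * e_y
print(mass)  # close to 1: the pdf integrates to 1
print(cov)   # close to 3/160 = 0.01875
```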
Similarly,
σ²_(X+Y) = σ²_X + σ²_Y and σ²_(X−Y) = σ²_X + σ²_Y, when X and Y are independent.
7/3/2020
Previous session
Joint distributions (joint discrete and joint continuous), their properties, mean and variance of the sample mean.
This session
Population and sample, random sample, parameters and
statistics
Null and Alternate Hypothesis, level of significance,
One sided and two sided tests of hypothesis on mean
t-distribution,
Sampling
Population
Sampling
Sampling Techniques
Sampling
Types of sampling
Sampling
Types of sampling-Multistage
Statistical Inference
Statistical inference refers to the process of selecting and using a
sample statistic to draw inferences about a population parameter. It
is concerned with using probability concepts to deal with
uncertainty in decision making.
Statistical Inference treats two different classes of problems
namely hypothesis testing and estimation.
Hypothesis Testing:-
Hypothesis testing is to test some hypothesis about the
parent population from which the sample is drawn. It must be
noted that tests of hypothesis also include tests of significance.
Estimation:-
The estimation theory deals with defining estimators for
unknown population parameters on the basis of sample study.
Test of Hypothesis
Parameter and Statistic:-
The statistical constants of the population, namely the mean
µ and variance σ², are usually referred to as parameters.
Statistical measures computed from sample observations
alone, e.g. the mean (X̄) and variance (S²), are usually
referred to as statistics.
Hypothesis: Null and Alternate
Null hypothesis H0: Veg and non-veg people are equally
populated in the village.
Alternative hypothesis H1: Veg and non-veg people are
not equally populated in the village.
Test of Hypothesis
Errors in sampling:-
The main objective in sampling theory is to draw valid inferences about the
population parameters on the basis of the samples results. In practice we decide to
accept (or) to reject the lot after examining a sample from it. As such we have two types
of errors.
(i) type I error and (ii) type II error
Type I Error:-
A type I error is committed by rejecting the null hypothesis when it is true. The
probability of committing a type I error is denoted by α, where
α = prob. (type I error)
= prob. (rejecting H0 when H0 is true)
Type II Error:-
A type II error is committed by accepting the null hypothesis when it is false.
The probability of committing a type II error is denoted by β, where
β = prob. (type II error)
= prob. (accepting H0 when H0 is false)
t distribution
The t distribution has the following properties:
The mean of the distribution is equal to 0.
The variance is equal to v / (v − 2), where v is the degrees of freedom
(see last section) and v > 2.
The variance is always greater than 1, although it is close to 1 when
there are many degrees of freedom. With infinite degrees of freedom,
the t distribution is the same as the standard normal distribution.
The following are the important applications of the t-distribution:
Test of the Hypothesis of the population mean.
Test of Hypothesis of the difference between the two means.
Test of Hypothesis of the difference between two means with dependent
samples.
Test of Hypothesis about the coefficient of correlation.
t distribution
Student's t-distribution (or simply the t-distribution) is any
member of a family of continuous probability distributions that
arises when estimating the mean of a normally
distributed population in situations where the sample size is small
and population standard deviation is unknown.
t = (X̄ − μ) / (S / √n)
Definition. If Z ~ N(0, 1) and U ~ χ2(r) are independent, then
the random variable
T = Z / √(U/r)
follows a t-distribution with r degrees of freedom. We
write T ~ t(r).
Sampling
Test of Significance
Let us now discuss the various situations where we have
to apply different tests of significance. For the sake of
convenience and clarity these situations may be summed up
under the following 3 heads:
test of significance for attributes
test of significance for variables (large samples)
test of significance for variables (small samples)
Sampling
SMALL SAMPLES
Defn:
When the size of the sample (n) is less than 30, then the
sample is called a small sample.
The following are some important tests for small samples:
Student's t-test
F-test
χ²-test
Sampling
Degrees of Freedom:-
Degrees of freedom is the no. of independent
observations in a set.
By degrees of freedom we mean the no. of classes in
which the values can be assigned arbitrarily (or) at will
without violating the restrictions (or) limitations placed.
Degrees of freedom = no. of groups – no. of constraints
t TEST
The t test is based on the assumption that we are comparing means.
The test statistic is defined by
t = |X̄ − μ| / (S / √n), where
S² = Σ(X − X̄)² / (n − 1)
t TEST-Example
An outbreak of Salmonella-related illness was attributed to
ice cream produced at a certain factory. Scientists measured
the level of Salmonella in 9 randomly sampled batches of ice
cream. The levels (in MPN/g) were:
0.593 0.142 0.329 0.691 0.231 0.793 0.519 0.392 0.418
Is there evidence that the mean level of Salmonella in the ice
cream is greater than 0.3 MPN/g?
t TEST-Example
H0: μ = 0.3; Ha: μ > 0.3
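The test statistic can be computed directly with Python's statistics module; this sketch reproduces the calculated value t = 2.2051 used in this example:

```python
# One-sample t statistic: t = (xbar - mu0) / (s / sqrt(n)),
# for the Salmonella levels against mu0 = 0.3 MPN/g.
import math
import statistics

levels = [0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418]
mu0 = 0.3

n = len(levels)
xbar = statistics.mean(levels)
s = statistics.stdev(levels)  # sample SD, divisor n - 1

t = (xbar - mu0) / (s / math.sqrt(n))
print(round(t, 4))  # ~2.2051
```

Since 2.2051 exceeds the one-tailed table value t(8, 0.05) = 1.860, there is evidence that the mean level exceeds 0.3 MPN/g.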
t Table
[Standard t table image omitted.]
t TEST-Example
Calculated t value: t = 2.2051.
t test
Example:
A manufacturer of a kind of bulb claims that his bulbs have a
mean life of 25 months with a standard deviation of 5
months. A random sample of 6 bulbs gave the following
lifetimes. Is the claim valid?
24,26,30,20,20,18
t test
Step 1: Ho: There is no significant difference between the
sample mean and the population mean.
Step 2: Dof = n-1 = 6-1 = 5; LOS = 5% =0.05.
t test
t = (X̄ − μ) / (S / √n), where
S = √( Σ(x − x̄)² / (n − 1) )
t test
X        x = X − X̄    x²
24       1             1
26       3             9
30       7             49
20       −3            9
20       −3            9
18       −5            25
ΣX = 138               Σx² = 102
t test
X̄ = ΣX / n = 138/6 = 23
S = √( Σx² / (n − 1) ) = √(102/5) = 4.517
t = (23 − 25)√6 / 4.517 = −1.084, so |t| = 1.084
t test
Calculated value t = 1.084
Tabulated value t5,0.05 = 2.015
Since CV < TV, accept Ho.
Therefore, there is no significant difference
between the sample mean and the
population mean.
Independent t Test
Compares the difference between two means of two
independent groups.
The comparison distribution is the distribution of differences
between sample means.
Population of measures for Group 1 and Group 2
Sample means from Group 1 and Group 2
Population of differences between sample means of Group 1 and
Group 2
Independent t Test
Paired-Sample t Test: two observations from each participant; the
second observation is dependent upon the first since they come from
the same person; compares a mean difference to a distribution of
mean difference scores.
Independent t Test: a single observation from each participant from
two independent groups; the observation from the second group is
independent of the first since they come from different subjects;
compares the difference between two means to a distribution of
differences between means.
Student’s t-test
The Student’s t-test compares the averages and standard deviations of two samples to see if there is a
significant difference between them.
t = (x̄1 − x̄2) / √( s1²/n1 + s2²/n2 )
where:
x̄1 is the mean of sample 1
s1 is the standard deviation of sample 1
n1 is the number of individuals in sample 1
x̄2 is the mean of sample 2
s2 is the standard deviation of sample 2
n2 is the number of individuals in sample 2
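The formula translates directly into code; the sample values below are illustrative only (they are not from the slides):

```python
# Unpooled two-sample t statistic:
# t = (xbar1 - xbar2) / sqrt(s1^2/n1 + s2^2/n2)
import math
import statistics

def two_sample_t(sample1, sample2):
    x1, x2 = statistics.mean(sample1), statistics.mean(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)  # divisor n - 1
    n1, n2 = len(sample1), len(sample2)
    return (x1 - x2) / math.sqrt(v1 / n1 + v2 / n2)

# Illustrative data (not from the lecture):
a = [24, 26, 30, 20, 20, 18]
b = [19, 22, 25, 17, 21, 18]
print(round(two_sample_t(a, b), 3))
```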
Paired t test
The t statistic for the paired t test is
t = d̄ / (S_d / √n)
where
d = X1 − X2,
d̄ is the average of the deviations, and
S_d is the standard deviation of the deviations.
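A sketch of the paired statistic with made-up before/after readings (illustrative only, not from the slides):

```python
# Paired t statistic: t = dbar / (s_d / sqrt(n)), where d = X1 - X2.
import math
import statistics

def paired_t(before, after):
    d = [b - a for b, a in zip(before, after)]  # deviations d = X1 - X2
    dbar = statistics.mean(d)
    s_d = statistics.stdev(d)                   # SD of the deviations
    return dbar / (s_d / math.sqrt(len(d)))

# Illustrative paired readings (not from the lecture):
before = [72, 68, 75, 80, 66]
after = [70, 65, 73, 75, 66]
print(round(paired_t(before, after), 3))
```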
7/10/2020
Previous session
Sampling techniques, t-distribution, mean, parameters and statistics.
This session
Statistics continued: sampling distribution of mean and variance, χ²-distribution.
Recall
Sampling: a population is studied by enumeration or by sampling.
Inference: test of hypotheses and estimation (point estimation and interval estimation).
Sampling
Critical Region:-
A region corresponding to a statistic in the sample space S
which lead to the rejection of H0 is called Critical Region (Or)
Rejection Region.
The region which leads to the acceptance of H0 is called the
acceptance region.
Level of Significance:-
The probability α that a random value of the statistic 't'
belongs to the critical region is known as the level of significance.
In other words, the level of significance is the size of the type I error.
The levels of significance usually employed in testing of hypothesis
are 5% and 1%.
3
7/10/2020
Sampling
One tailed test:-
A test of any statistical hypothesis where the alternative
hypothesis is one tailed (right tailed (or) left tailed) is called a
one tailed test.
Thus in a one tailed test, the rejection region will be
located in only one tail, left or right, depending upon the
alternative hypothesis formulated.
We test the null hypothesis
H0 : μ = μ0 against the alternative hypothesis
H1 : μ > μ0 (right tailed) or
H1 : μ < μ0 (left tailed); either is called a one tailed test.
Sampling
Two tailed test:-
In a two tailed test the rejection region is located in
both the tails.
In a test of statistical hypothesis where the alternative
hypothesis is two tailed, we test the null hypothesis
H0 : μ = μ0 against
H1 : μ ≠ μ0 [μ > μ0 (or) μ < μ0].
Sampling
Procedure for testing of hypothesis :-
Set up the null hypothesis.
Choose the appropriate level of significance (either 5% or
1% level) and find the Degree of freedom.
Compute the test statistic and find the Table value
We compare the calculated value and tabulated value.
If C.V. < T.V., H0 is accepted at the 5% or 1% level;
if C.V. > T.V., H0 is rejected at the 5% or 1% level.
Sampling distribution
The distribution of a statistic (measure) obtained from
repeated random sampling from a population is a sampling
distribution.
If we draw 10 random samples from a population and
calculate the mean of each sample, then the sequence of the 10
means is a sampling distribution of means. Likewise for other
measures.
Sampling distribution
Consider tossing a die.
The mean of a single throw
is (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5.
When we increase the sample size
to 2, 3, …, the distribution of the
sample mean moves towards normality.
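This movement towards normality can be seen by simulation; in the sketch below (sample sizes and trial count are arbitrary choices), the mean of the sample means stays near 3.5 while their spread shrinks as the sample size grows:

```python
# Sampling distribution of the mean for die throws: as the sample size
# grows, the sample mean concentrates around 3.5 (like sigma / sqrt(n)).
import random
import statistics

random.seed(1)

def sample_means(sample_size, trials=20000):
    return [statistics.mean(random.randint(1, 6) for _ in range(sample_size))
            for _ in range(trials)]

for n in (1, 2, 10):
    means = sample_means(n)
    print(n, round(statistics.mean(means), 2), round(statistics.stdev(means), 2))
```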
Sampling distribution
The sampling distribution of the mean
Pumpkin:             A   B   C   D   E   F
Weight (in pounds):  19  14  15  9   10  17
Chi-square χ2 Distribution
A standard normal deviate is a random sample from the standard
normal distribution. The Chi Square distribution is the distribution
of the sum of squared standard normal deviates. The degrees of
freedom of the distribution is equal to the number of standard
normal deviates being summed. Therefore, Chi Square with one
degree of freedom, written as χ2(1), is simply the distribution of a
single normal deviate squared.
Consider the following problem: you sample two scores from a
standard normal distribution, square each score, and sum the
squares. What is the probability that the sum of these two squares
will be six or higher? Since two scores are sampled, the answer can
be found using the Chi Square distribution with two degrees of
freedom.
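The stated problem can be checked by simulation; since χ²(2) is an exponential distribution with mean 2, the exact answer is e^(−3) ≈ 0.0498 (the trial count below is an arbitrary choice):

```python
# P[Z1^2 + Z2^2 >= 6] for independent standard normals:
# chi-square with 2 d.o.f. is exponential with mean 2, so exact = e^(-3).
import math
import random

random.seed(0)

trials = 200_000
hits = sum(1 for _ in range(trials)
           if random.gauss(0, 1) ** 2 + random.gauss(0, 1) ** 2 >= 6)

p_sim = hits / trials
print(p_sim)         # simulated probability
print(math.exp(-3))  # exact: ~0.0498
```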
Chi-square χ2 Distribution
The mean of a Chi Square distribution is its degrees of freedom.
As the degrees of freedom increase, the Chi Square distribution
approaches a normal distribution.
Chi square
Independence of Attributes
Chi-square test for independence of attributes:-
Defn: Literally, an attribute means a quality (or)
characteristic.
Ex: sincerity, honesty, etc.
An attribute may be marked by its presence (or)
absence in a member of a given population.
Chi square
Independence of Attributes
When two characters A and B are considered, we'll
have a 2 x 2 contingency table of observed frequencies:

          A = a1        A = a2
B = b1    a             b             Row total
B = b2    c             d             Row total
          Column total  Column total  Grand total
Chi square
Independence of Attributes
Null Hypothesis. H0: Attributes are independent
Alternative Hypothesis.
H1: Attributes are not independent
D.o.f. = (c − 1)(r − 1)
where
c = no. of columns
r = no. of rows
Chi square
Independence of Attributes-Example
In an anti-malarial campaign in an area, quinine was
administered to 812 persons out of a total population of
3248. The number of fever cases is below. Discuss the
usefulness of quinine in the campaign.
Chi square
Independence of Attributes-Example
Step 1: Ho: Quinine is not effective.
Step 2: Dof = [r-1][c-1]
=[2-1][2-1] = 1;
LOS = 5% =0.05.
Step 3: Calculated value.
The expected frequencies are:

               Fever                 No fever                Total
Quinine        812×240/3248 = 60     812×3008/3248 = 752     812
No quinine     2436×240/3248 = 180   2436×3008/3248 = 2256   2436
Total          240                   3008                    3248
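The χ² statistic follows from the observed and expected tables. Note the observed fever counts are not shown on the slide, so the values below (20 fever cases among the 812 treated, 220 among the 2436 untreated) are a reconstruction: they are consistent with the marginal totals and reproduce the slide's calculated χ² = 38.39:

```python
# Chi-square test of independence on the 2x2 quinine table.
# Observed fever counts are a reconstruction consistent with the
# slide's marginal totals and its calculated chi-square of 38.39.
observed = [[20, 792],     # quinine:    fever, no fever
            [220, 2216]]   # no quinine: fever, no fever

row_tot = [sum(r) for r in observed]
col_tot = [sum(c) for c in zip(*observed)]
grand = sum(row_tot)

chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_tot[i] * col_tot[j] / grand  # e.g. 812*240/3248 = 60
        chi2 += (observed[i][j] - expected) ** 2 / expected

print(round(chi2, 2))  # 38.39
```

The value 38.39 far exceeds the 1-d.o.f. table value of 3.841 at the 5% level, matching the slide's conclusion.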
Chi square
Independence of Attributes-Example
Step 4: Table value. From the chi-square table, χ²(1, 0.05) = 3.841.
Chi square
Independence of Attributes-Example
Calculated value χ² = 38.39.
Since the calculated value exceeds the table value, reject H0;
quinine is effective.
P[X̄ ≥ 1250] = P[ (X̄ − 1200)/(250/√60) ≥ (1250 − 1200)/(250/√60) ]
            = P[z ≥ √60/5] = P[z ≥ 1.55] = 0.06
Probability and Statistics
Dr.J.Vijayarangam
BITS Pilani jvijayarangam@wilp.bits-pilani.ac.in
Pilani|Dubai|Goa|Hyderabad
Previous Session
Types of errors
Sampling Distributions-Mean and Variance
Chi-Square Distribution
Central Limit Theorem
This Session
Point estimates of mean and variance, maximum likelihood; confidence intervals.
Estimation
In studying a population, estimation is a standard route: it is about
estimating the parameter values of the population using the
statistic values of a sample, e.g. X̄ = (1/n) Σ_{k=1}^{n} X_k.
Point Estimation
One measure of the quality of an estimator X̂ is its bias, or how far
off its estimate is on average from the true value X:
bias(X̂) = E[X̂] − X
where the expected value is over the randomness involved in X̂.
If the model has a high bias, its predictions are off, which
corresponds to underfitting.
For the sample 18, 21, 17, 16, 24, 20:
X̄ = (18 + 21 + 17 + 16 + 24 + 20)/6 = 19.33
S² = 1/(6 − 1) Σ_k (X_k − 19.33)² = 8.67
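The same point estimates via the statistics module:

```python
# Point estimates for the sample 18, 21, 17, 16, 24, 20:
# sample mean and sample variance (divisor n - 1).
import statistics

data = [18, 21, 17, 16, 24, 20]
xbar = statistics.mean(data)       # 116/6 = 19.33...
s2 = statistics.variance(data)     # divisor n - 1
print(round(xbar, 2), round(s2, 2))  # 19.33 8.67
```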
Maximum Likelihood
The likelihood of a set of data is the probability of obtaining
that particular set of data, given the chosen probability
distribution model.
This expression contains the unknown model parameters. The
values of these parameters that maximize the sample
likelihood are known as the Maximum Likelihood
Estimates or MLEs.
Likelihood(θ )= probability of observing the given data as a
function of ‘θ ’.
Maximum Likelihood
The maximum likelihood estimate (mle) of θ is that value of θ
that maximises likelihood(θ).
It is defined as
L(θ) = Π_{i=1}^{n} f(x_i | θ)
log L(θ) = Σ_{i=1}^{n} log f(x_i | θ)
Maximum Likelihood
Example 1
MLE: L(μ, σ | x) = P(x | μ, σ)
Suppose we have x = 32. If we assume a normal model with mean = 28 and
SD = 2, then the above equation gives L ≈ 0.03.
Maximum Likelihood
Example2
Consider a sample 0,1,0,0,1,0 from a binomial distribution, with the form
P[X=0]=(1-p), P[X=1]=p. Find the maximum likelihood estimate of p.
Soln :
L(p) = P[X=0] P[X=1] P[X=0] P[X=0] P[X=1] P[X=0]
     = (1−p) p (1−p) (1−p) p (1−p)
     = (1−p)³ p²
log L(p) = log[(1−p)³ p²] = 3 log(1−p) + 2 log p
d/dp log L(p) = −3/(1−p) + 2/p = (2 − 2p − 3p)/(p(1−p)) = 0
⇒ 2 − 5p = 0 ⇒ p = 2/5
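A numerical check that L(p) = (1 − p)³p² peaks at p = 2/5 (the grid granularity is an arbitrary choice):

```python
# Maximize L(p) = (1 - p)^3 * p^2 over a grid; the argmax is p = 2/5.
def likelihood(p):
    return (1 - p) ** 3 * p ** 2

grid = [i / 10000 for i in range(1, 10000)]
p_hat = max(grid, key=likelihood)
print(p_hat)  # 0.4
```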
Tests for two proportions
Example 2
Suppose the previous example is stated a little bit differently.
Suppose the Acme Drug Company develops a new drug,
designed to prevent colds. The company states that the drug
is more effective for women than for men. To test this claim,
they choose a simple random sample of 100 women and
200 men from a population of 100,000 volunteers.
At the end of the study, 38% of the women caught a cold; and
51% of the men caught a cold. Based on these findings, can
we conclude that the drug is more effective for women than
for men? Use a 0.01 level of significance.
Tests for two proportions
Example 2
The first step is to state the null hypothesis and an alternative
hypothesis.
Null hypothesis: P1 >= P2
Alternative hypothesis: P1 < P2
Note that these hypotheses constitute a one-tailed test. The null
hypothesis will be rejected if the proportion of women
catching cold (p1) is sufficiently smaller than the proportion of
men catching cold (p2).
Tests for two proportions
Example 2
Using sample data, we calculate the pooled sample proportion
(p) and the standard error (SE). Using those measures, we
compute the z-score test statistic (z).
p = (p1 * n1 + p2 * n2) / (n1 + n2) = [(0.38 * 100) + (0.51 * 200)] /
(100 + 200) = 140/300 = 0.467
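The remaining steps of the usual pooled two-proportion test (standard error, then the z-score) can be sketched as follows:

```python
# Two-proportion z test for the drug example:
# pooled p, standard error, then z = (p1 - p2) / SE.
import math

p1, n1 = 0.38, 100   # women
p2, n2 = 0.51, 200   # men

p = (p1 * n1 + p2 * n2) / (n1 + n2)           # pooled proportion = 140/300
se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

print(round(p, 3), round(z, 2))  # 0.467 -2.13
```

Since z = −2.13 is not below the one-tailed critical value of −2.33 at the 0.01 level, H0 cannot be rejected at that level.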
Solution:
Step 1: 4% of apples are defective, so p = 0.96 (proportion good) and q = 0.04.
Step 2: Standard error: S.E. = √(pq/n) = √(0.96 × 0.04 / 600) = 0.008
Step 3: The 95% CI is p ± 1.96 S.E. = 0.96 ± 1.96 × 0.008 = 0.9443 to 0.9757.
Step 4: With n = 600, the boundary [for good ones] is
[0.9443×600, 0.9757×600] = [567, 585].
Since the number of defectives should lie in [15, 33] and 36 is outside, reject H0.
THANKS
Previous Session
Point estimates of mean and variance, maximum likelihood; confidence intervals.
This Session
Correlation
Regression
Tests of Hypothesis for correlation
BITS Pilani
Pilani Campus
Correlation
Correlation
Measures the degree of association between two interval-scaled
variables: the analysis of the relationship between two
quantitative outcomes, e.g., height and weight.
Correlation
Graphically, we plot them in a scatter plot to find correlation.
Correlation
Karl Pearson's correlation coefficient:
r = (N ΣXY − ΣX ΣY) / ( √(N ΣX² − (ΣX)²) · √(N ΣY² − (ΣY)²) )

X: 1 2 3 4 5
Y: 2 5 3 8 7

X     Y     XY    X²    Y²
1     2     2     1     4
2     5     10    4     25
3     3     9     9     9
4     8     32    16    64
5     7     35    25    49
15    25    88    55    151

r = (5×88 − 15×25) / ( √(5×55 − 15²) · √(5×151 − 25²) ) = 0.8062

As the r value is positive, they are positively correlated.
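The tabulated computation as code, reproducing r = 0.8062:

```python
# Karl Pearson's r from the column sums of the table above.
import math

xs = [1, 2, 3, 4, 5]
ys = [2, 5, 3, 8, 7]
n = len(xs)

sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sx2 = sum(x * x for x in xs)
sy2 = sum(y * y for y in ys)

r = (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
print(round(r, 4))  # 0.8062
```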
Correlation
Spearman’s correlation coefficient(ranks)
Spearman's rank correlation coefficient or Spearman's rho, is
a measure of statistical dependence between two variables
based on ranks or relative values
Spearman’s correlation coefficient(ranks)
Example
Find Spearman's rank correlation between IQ (X) and hours
spent watching TV (Y).
ρ = −0.1757
Correlation
Assumptions
Assumption 1: The correlation coefficient r assumes that the two
variables measured form a bivariate normal distribution
population.
Assumption 2: The correlation coefficient r measures only linear
associations: how nearly the data falls on a straight line. It is
not a good summary of the association if the scatterplot has a
nonlinear (curved) pattern.
Assumption 3: The correlation coefficient r is not a good
summary of association if the data are heteroscedastic, i.e., the
spread differs across the data. (The opposite case, equal finite
variance, is known as homoscedasticity or homogeneity of variance.)
Assumption 4: The correlation coefficient r is not a good
summary of association if the data have outliers.
Regression
Regression
Regression follows correlation in identifying the causal
relationship between the two correlated variables.
The dependence of the dependent variable Y on the independent
variable X.
Relationship is summarized by a regression equation.
y = a + bx
a=intercept at y axis
b=regression coefficient
Regression
The line of regression is the line which gives the best estimate
to the value of one variable for any specific value of the other
variable. Thus the line of regression is the line of “best fit” and
is obtained by the principle of least squares.
This principle consists in minimizing the sum of the squares of
the deviations of the actual values of y from their estimated
values given by the line of best fit.
Regression
Procedure
Step 1: Write the normal equations for the regression line
y = mx + c as
Σy = m Σx + n c
Σxy = m Σx² + c Σx
Step 2: Substitute the data:
25 = 15m + 5c
88 = 55m + 15c
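Solving the two normal equations for this data (the closed-form solution of the 2×2 system) gives m = 1.3 and c = 1.1; a sketch:

```python
# Solve the normal equations for y = mx + c:
#   sum(y)  = m*sum(x)  + n*c
#   sum(xy) = m*sum(x^2) + c*sum(x)
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 3, 8, 7]
n = len(xs)

sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sx2 = sum(x * x for x in xs)

# Closed-form solution of the 2x2 system:
m = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)
c = (sy - m * sx) / n

print(m, c)  # 1.3 1.1
```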
The Syllabus
Probability and Statistics
Probability
Random Variable
Probability distributions
Joint Distributions(Bi-variate)
Sampling
Tests of Hypotheses
Estimation
Correlation
Regression
Others:
Chebyshev’s Theorem, Poisson Processes, Simulation,
Central limit theorem
THANKS