
11-09-2018

PROBABILITY AND
PROBABILITY
DISTRIBUTIONS

Mutually exclusive and not mutually exclusive events
• Mutually exclusive: events of which one and only one can
take place at a time.
• Example: from a deck of 52 cards, probability of drawing an ace
or an even-numbered card
• Probability that either A or B will occur
• P(A or B) = P(A) + P(B)
• Not mutually exclusive: events that can occur together, such as
drawing a heart or a queen (the queen of hearts is both)
• P(A or B) = P(A) + P(B) – P(AB)
• Collectively exhaustive list: a list that contains every possible
outcome, such as the six faces of a die
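The two addition rules can be checked by counting cards directly. The following is a minimal sketch, assuming a standard 52-card deck; the helper names (`prob`, `is_ace`, and so on) are illustrative and not part of the slides.

```python
from fractions import Fraction

# Build a standard 52-card deck as (rank, suit) pairs.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]


def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    favourable = sum(1 for card in DECK if event(card))
    return Fraction(favourable, len(DECK))


def is_ace(card):
    return card[0] == "A"


def is_even(card):
    return card[0] in {"2", "4", "6", "8", "10"}


def is_heart(card):
    return card[1] == "hearts"


def is_queen(card):
    return card[0] == "Q"


# Mutually exclusive events: no card is both an ace and an even number,
# so P(A or B) = P(A) + P(B).
p_ace_or_even = prob(lambda c: is_ace(c) or is_even(c))
print(p_ace_or_even, prob(is_ace) + prob(is_even))  # 6/13 6/13

# Not mutually exclusive: the queen of hearts is counted in both events,
# so P(A or B) = P(A) + P(B) - P(AB).
p_heart_or_queen = prob(lambda c: is_heart(c) or is_queen(c))
p_heart_and_queen = prob(lambda c: is_heart(c) and is_queen(c))
print(p_heart_or_queen)  # 4/13
```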


Types of probability
• Classical approach
• Such as probability of getting a head = 0.5
• Relative frequency approach
• We repeat experiments or runs
• Proportion of times the event has occurred in a given number of
trials under controlled conditions
• The more trials, the more accurate the estimate
• Subjective approach
• Example: it's very cloudy, I think it will rain today
• With replacement and without replacement
• Probability of drawing a particular card = 1/52
• Probability of drawing a particular remaining card second = 1/51,
if the first card was not replaced
• Probability of drawing a particular card second = 1/52, if the
first card was replaced
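The relative frequency approach can be illustrated with a small simulation. This is a sketch only, assuming a fair coin simulated with Python's pseudo-random generator; the function name `relative_frequency` and the fixed seed are illustrative choices.

```python
import random


def relative_frequency(n_trials, seed=0):
    """Estimate P(head) as the proportion of heads in n_trials coin tosses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials


# The estimate tends to move closer to the classical value 0.5
# as the number of trials grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```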

Probabilities under condition of statistical independence
• When two events happen, the outcome of the first event may
(dependent) or may not (independent) have an effect on the
outcome of the second event
• Marginal: the probability of a single event, P(A); the occurrence
of one event does not alter the other event
• Joint: probability of two or more independent events occurring
together or in succession
• P(AB) = P(A) × P(B)


Probabilities under condition of statistical dependence
• The probability of an event is affected by the occurrence of
another event
• Conditional probability

• P(B|A) = P(BA)/P(A)

• Joint probability
• P(BA) = P(B|A)*P(A)
• Marginal probability
• P(B) = P(B|A)*P(A) + P(B|C)*P(C), where A and C are mutually
exclusive and collectively exhaustive
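A worked example may help tie the three quantities together. The sketch below assumes two cards drawn without replacement, with A = "first card is an ace", C = "first card is not an ace" and B = "second card is an ace"; this scenario is chosen for illustration and is not taken from the slides. The marginal probability is computed as the sum of the joint probabilities P(BA) + P(BC).

```python
from fractions import Fraction

# Assumed scenario: two cards drawn from a 52-card deck without replacement.
p_a = Fraction(4, 52)           # P(A): first card is an ace
p_c = 1 - p_a                   # P(C): first card is not an ace
p_b_given_a = Fraction(3, 51)   # P(B|A): 3 aces left among 51 cards
p_b_given_c = Fraction(4, 51)   # P(B|C): 4 aces left among 51 cards

# Joint probability: P(BA) = P(B|A) * P(A)
p_ba = p_b_given_a * p_a
p_bc = p_b_given_c * p_c

# Marginal probability: sum of the joint probabilities over A and C,
# which are mutually exclusive and cover every case.
p_b = p_ba + p_bc

# Conditional probability recovered from the joint: P(B|A) = P(BA) / P(A)
recovered = p_ba / p_a

print(p_ba, p_b, recovered)  # 1/221 1/13 1/17
```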

Probability distribution
[Figure: probability (vertical axis) against events (horizontal axis)]
• It is like a frequency distribution:
• A listing of the probabilities of all outcomes that could
result if the experiment were done
• Discrete: a definite list of possible events
• Continuous: any value within a given range
• For discrete values: sum of all the probabilities within the
defined set = 1
• Example:
• There are six groups in air quality index (India): Good, Moderate,
Unhealthy for sensitive, Unhealthy, Very unhealthy and Hazardous
• If Pi is the probability that the air quality index on a given day
falls in group i, then
∑Pi = 1


Bernoulli process – binomial distribution


• Binomial distribution describes discrete data resulting from a
Bernoulli process
• Each trial results in one of two possible events
• Example: Today people will travel or not travel
It will rain today or not
• Bernoulli process –
• Each trial has only two possible outcomes
• Probability of a given outcome is the same on every trial
• Trials are statistically independent
• Calculating binomial probability of success
• p = probability of success
• q = 1-p = probability of failure
• r = number of successes desired
• n = number of trials undertaken
• Probability of r successes in n trials =
$\frac{n!}{r!\,(n-r)!}\, p^{r} q^{n-r}$
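The binomial probability of r successes in n trials, n!/(r!(n−r)!) × p^r × q^(n−r), can be evaluated with the standard library. This is a minimal sketch using the symbols defined above; the function name `binomial_pmf` and the example values are illustrative.

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent Bernoulli trials."""
    q = 1 - p  # probability of failure
    # comb(n, r) equals n! / (r! * (n - r)!)
    return comb(n, r) * p**r * q**(n - r)


# Illustrative values: 2 heads in 4 tosses of a fair coin.
print(binomial_pmf(2, 4, 0.5))  # 0.375
```

Summing `binomial_pmf(r, n, p)` over r = 0, ..., n should give 1, which is a quick sanity check on any chosen n and p.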

Central tendency measure for binomial distribution
• For a single trial
• Mean = p
• Variance = p(1-p)
• Standard deviation = √(p(1−p))
• Example: p = probability that a student in the UDM class is a
girl, in a sample of N = 10 students
• Mean = Np
• Variance = Np(1-p)
• Standard deviation = √(Np(1−p))
• COV (coefficient of variation) = standard deviation / mean
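These summary measures are simple to compute for N trials. The sketch below assumes N = 10 students and a hypothetical p = 0.5, since the slides do not give the proportion of girls in the class; the standard deviation is taken as the square root of the variance.

```python
from math import sqrt


def binomial_summary(n, p):
    """Return mean, variance, standard deviation and COV for n trials."""
    mean = n * p
    variance = n * p * (1 - p)
    sd = sqrt(variance)
    cov = sd / mean  # coefficient of variation
    return mean, variance, sd, cov


# Hypothetical inputs: sample of 10 students, p = 0.5 assumed for illustration.
mean, variance, sd, cov = binomial_summary(10, 0.5)
print(mean, variance, round(sd, 4), round(cov, 4))
```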


Poisson distribution
• Distribution of occurrence of an event
• A discrete probability distribution that expresses the
probability of a given number of events occurring in a
fixed interval of time and/or space, if these events occur
with a known average rate and independently of the time
since the last event.
• Example –
• Probability of a given number of rainfall events in a certain time
period in a region
• Probability of a given number of road crashes on a certain stretch
• Probability that a certain number of people will come to a water
billing facility

Poisson distribution
• Properties
• The experiment results in outcomes that can be classified as
successes or failures.
• The average number of successes (μ) that occurs in a specified region
is known.
• The probability that a success will occur is proportional to the size of
the region.
• The probability that a success will occur in an extremely small region is
virtually zero.
• Conditions
• The number of successes in two disjoint time intervals is independent.
• The probability of a success during a small time interval is proportional
to the entire length of the time interval.
• The mean of the Poisson distribution is equal to μ.
• The variance is also equal to μ.


Probability of occurrence of event
$P(x; \mu) = \frac{e^{-\mu}\, \mu^{x}}{x!}$
• where:
• e = a constant equal to approximately 2.71828 (actually, e is the
base of the natural logarithm system);
• μ = the mean number of successes that occur in a specified region;
• x: the actual number of successes that occur in a specified region;
• P(x; μ): the Poisson probability that exactly x successes occur in a
Poisson experiment, when the mean number of successes is μ;
and
• x! is the factorial of x.
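The Poisson probability, e^(−μ) × μ^x / x!, translates directly into code. This is a minimal sketch with the symbols defined above; the function name `poisson_pmf` and the example mean of 3 are illustrative assumptions.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """Probability of exactly x successes when the mean number is mu."""
    return exp(-mu) * mu**x / factorial(x)


# Illustrative values: mean of 3 events per interval (an assumed figure).
for x in range(6):
    print(x, round(poisson_pmf(x, 3), 4))
```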

Continuous distribution
• The continuous uniform distribution is often abbreviated U(a,b),
with a and b being the minimum and maximum values.
• Assumes every value between a and b is equally likely
• cumulative distribution function: The probability that a
real-valued random variable X with a given probability
distribution will be found at a value less than or equal to x.
• p-value: The probability of obtaining a test statistic at
least as extreme as the one that was actually observed,
assuming that the null hypothesis is true.
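For the uniform distribution U(a,b), the cumulative distribution function has a simple closed form. The sketch below assumes a is the lower bound and b the upper bound; the function name `uniform_cdf` and the example range are illustrative.

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for X uniformly distributed between a (min) and b (max)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    # Inside the range, probability grows linearly with x.
    return (x - a) / (b - a)


# Illustrative range: U(0, 10).
print(uniform_cdf(2.5, 0, 10))  # 0.25
```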


Probability distributions
• Normal distribution
• Standard form: mean = 0 and standard deviation = 1
• If data are normally distributed with mean = 0 and SD = 1, then
we can use the probability distribution table

Z-scores
• Problem?
• Not all datasets will have mean = 0 and SD = 1
• For example,
• Mean = 50 and SD = 20
• Then what to do?
• We can standardize the data such that mean = 0 and SD = 1
• Steps
• First, we express each data point relative to the mean by
subtracting the mean from it
• Second, we divide the resulting score by the standard deviation of
the data set, so that the standardized data have SD = 1
• The resulting scores are known as z-scores: z = (x − mean)/SD
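The two standardization steps can be sketched as follows. The data values are made up for illustration, and the population standard deviation (`statistics.pstdev`) is an assumption, since the slides do not say whether the sample or population form is intended.

```python
import statistics


def z_score(x, mean, sd):
    """Step 1: subtract the mean. Step 2: divide by the standard deviation."""
    return (x - mean) / sd


def standardize(data):
    """Convert every data point to a z-score (population SD assumed)."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    return [z_score(x, mean, sd) for x in data]


# Using the slide's example parameters (mean = 50, SD = 20) with a
# hypothetical data point x = 90.
print(z_score(90, 50, 20))  # 2.0

# Hypothetical data set: after standardizing, mean is 0 and SD is 1.
print(standardize([10, 30, 50, 70, 90]))
```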


Shape of distribution
• Data: probability of occurrence of a data point
• Look back to frequency distribution and percentiles
• What is a percentile?
• The value below which a given percentage of the data falls; for
example, 20% of the data points lie below the 20th percentile
• To generate them we might want to refer to probability
distributions

Z-score and probability distribution table


• For positive score
• Smaller portion (+ve score)
• Percentage of data lying above the defined score
• Larger portion (+ve score)
• Percentage of data lying below the defined score

• For negative score


• Smaller portion (-ve score)
• Percentage of data lying below the defined score
• Larger portion (-ve score)
• Percentage of data lying above the defined score
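The smaller and larger portions read from the table can also be computed from the standard normal distribution. This is a sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the printed table; the function name `portions` is illustrative.

```python
from statistics import NormalDist


def portions(z):
    """Return (smaller portion, larger portion) of the standard normal at z."""
    below = NormalDist().cdf(z)  # share of data lying below the score
    above = 1 - below            # share of data lying above the score
    return min(below, above), max(below, above)


# For a positive score the smaller portion lies above it;
# for a negative score the smaller portion lies below it.
for z in (1.0, -1.0):
    smaller, larger = portions(z)
    print(z, round(smaller, 4), round(larger, 4))
```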


Estimating percentile scores


• Let's say we have to calculate the 25th percentile
• For data with mean = 30 and SD = 3
• Then,
• First, determine the z-score for 25% or 0.25 probability: z ≈ −0.68
(negative, because the 25th percentile lies below the mean)
• Second, substitute into the z-score equation z = (x − mean)/SD,
• −0.68 = (x-30)/3
• Therefore, x = 27.96 for the first quartile
• And, using z = +0.68, x = 32.04 for the third quartile
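The same percentile calculation can be sketched in code, assuming normally distributed data and using `statistics.NormalDist.inv_cdf` instead of a printed z-table; the function name `percentile_value` is illustrative. Because the exact z-score for 0.25 is closer to 0.674 than to the tabulated 0.68, the results differ slightly from the hand calculation.

```python
from statistics import NormalDist


def percentile_value(pct, mean, sd):
    """Value below which pct percent of normally distributed data falls."""
    z = NormalDist().inv_cdf(pct / 100)  # z-score for the given probability
    return mean + z * sd                 # rearranged from z = (x - mean) / sd


# Slide example: mean = 30, SD = 3.
print(round(percentile_value(25, 30, 3), 2))  # first quartile
print(round(percentile_value(75, 30, 3), 2))  # third quartile
```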
