
Examples of discrete probability distributions

The Binomial and Poisson Distributions


Binomial Probability Distribution
◎ A fixed number of observations (trials), n
○ e.g., 15 tosses of a coin; 20 patients; 1000 people
surveyed
◎ A binary random variable
○ e.g., head or tail in each toss of a coin; infected or
not infected
○ Generally called “success” and “failure”
○ Probability of success is p, probability of failure is
1 − p
◎ Constant probability for each observation
○ e.g., Probability of getting a tail is the same each time
we toss the coin
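
The three conditions above are what make the binomial formula applicable. As a minimal sketch, assuming the standard binomial probability P(X = k) = C(n, k) p^k (1 − p)^(n − k) (the formula is not printed in the text above), the probability of k successes in n trials could be computed as follows. The function name `binomial_pmf` and the coin example values are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p): k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: 15 tosses of a fair coin (n = 15, p = 0.5).
n, p = 15, 0.5
prob_exactly_7_heads = binomial_pmf(7, n, p)

# The probabilities over all possible counts 0..n should add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(f"P(X = 7) = {prob_exactly_7_heads:.4f}")
print(f"Sum over k = 0..{n}: {total:.4f}")
```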

Binomial Distribution
Mean & S.D.

Mean = np
s.d. = square root of [np(1 − p)]
(for a single trial, n = 1, these reduce to p and the square root of [p(1 − p)])
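
A quick numerical check may help here. The sketch below sums the mean and s.d. of the number of successes directly from the binomial probabilities and compares them with np and the square root of np(1 − p); with n = 1 these are p and the square root of p(1 − p). The function name and the values n = 20, p = 0.3 are hypothetical.

```python
from math import comb, sqrt


def binomial_moments(n: int, p: float) -> tuple[float, float]:
    """Mean and s.d. of the number of successes, summed directly from the pmf."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    return mean, sqrt(variance)


# Hypothetical example: 20 patients, each with a 30% chance of being infected.
n, p = 20, 0.3
mean, sd = binomial_moments(n, p)
print(f"mean = {mean:.4f}  (np = {n * p:.4f})")
print(f"s.d. = {sd:.4f}  (sqrt(np(1-p)) = {sqrt(n * p * (1 - p)):.4f})")

# A single trial (n = 1): mean p and s.d. sqrt(p(1 - p)).
mean_1, sd_1 = binomial_moments(1, p)
print(f"single trial: mean = {mean_1:.4f}, s.d. = {sd_1:.4f}")
```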

Binomial Distribution
Application Example

Cohort study (or cross-sectional):


○ The number of exposed individuals in
your sample that develop the disease
○ The number of unexposed individuals in
your sample that develop the disease
Case-control study:
○ The number of cases that have had the
exposure
○ The number of controls that have had
the exposure

POISSON PROBABILITY DISTRIBUTION

◎ Poisson distribution is for counts
○ If events happen at a constant rate over time, the Poisson
distribution gives the probability of X events occurring
in time T.

Poisson Mean and Variance

Mean = λ
Variance = λ (s.d. = square root of λ)

POISSON Probability
Application Example

The Poisson distribution models counts, such as the number of
new vehicular accidents that occur at an intersection next month.
The distribution tells you the probability of each possible
number of new cases, from 0 to infinity.
If X = number of new cases next month and X ~ Poisson(λ), then the
probability that X = k (a particular count) is:

$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$
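
The Poisson probability P(X = k) = λ^k e^(−λ) / k! can be evaluated directly. The sketch below is one way to do so; the function name `poisson_pmf` and the value λ = 3 accidents per month are assumptions made for illustration, since the example above gives no specific λ.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), where lam is the mean count."""
    return lam**k * exp(-lam) / factorial(k)


# Hypothetical mean of 3 accidents per month (not a value from the source).
lam = 3.0
p_none = poisson_pmf(0, lam)

# P(X <= 2) is the sum of the probabilities of 0, 1 and 2 accidents.
p_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))

print(f"P(X = 0)  = {p_none:.4f}")
print(f"P(X <= 2) = {p_at_most_2:.4f}")
```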

Example of a Poisson Probability Table
λ= 2
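
The table itself is not reproduced here. As a rough sketch, a table of this kind for λ = 2 could be regenerated as follows; the range of counts shown (k = 0 to 6) and the cumulative column are assumptions about the layout, not taken from the original table.

```python
from math import exp, factorial

lam = 2.0  # Poisson mean, as stated in the table heading

# Each row holds (k, P(X = k), P(X <= k)).
rows = []
cumulative = 0.0
for k in range(7):
    prob = lam**k * exp(-lam) / factorial(k)
    cumulative += prob
    rows.append((k, prob, cumulative))

print(" k   P(X = k)   P(X <= k)")
for k, prob, cum in rows:
    print(f"{k:2d}   {prob:8.4f}   {cum:9.4f}")
```

Because counts above 6 are still possible, the last cumulative value stays slightly below 1.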

NOTE

“Poisson Process” (rates)


Note that the Poisson parameter λ can be given as
the mean number of events that occur in a defined
time period OR, equivalently, λ can be given as a
rate, such as λ = 2/month (2 events per 1 month),
that must be multiplied by t = time (called a “Poisson
Process”):

X ~ Poisson(λt)

$$P(X = k) = \frac{(\lambda t)^{k} e^{-\lambda t}}{k!}$$
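
When the parameter is given as a rate, the only change is that the mean count becomes rate × time. A minimal sketch, using the rate of 2 events per month from the note and an assumed window of t = 3 months (the window length and the function name are hypothetical):

```python
from math import exp, factorial


def poisson_process_pmf(k: int, rate: float, t: float) -> float:
    """P(X = k) events in a window of length t, for events at a constant rate."""
    mean = rate * t  # expected number of events in the window
    return mean**k * exp(-mean) / factorial(k)


rate = 2.0  # 2 events per month, as in the note
t = 3.0     # hypothetical window of 3 months (an assumption)

p_zero = poisson_process_pmf(0, rate, t)
p_six = poisson_process_pmf(6, rate, t)

print(f"P(no events in {t:g} months) = {p_zero:.4f}")
print(f"P(6 events in {t:g} months)  = {p_six:.4f}")
```

The rate and the time must use the same unit (here, months); otherwise the product rate × time is not the expected count.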

Chi-Square Test of Independence

Chi-square (χ²) is a statistical test used to determine whether
your observed results are consistent with your hypothesis.
The test statistic measures the agreement between actual counts
and the counts expected under the null hypothesis. It is a
non-parametric test.
The chi-square test of independence can be used for any
categorical variable; the group (independent) and the test
variable (dependent) can be nominal, dichotomous, ordinal, or
grouped interval.
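
To make "agreement between actual counts and expected counts" concrete, the sketch below computes the usual Pearson statistic, the sum of (observed − expected)² / expected over all cells, with each expected count taken as row total × column total / N. The function name and the 2×2 counts are hypothetical, chosen only for illustration.

```python
def chi_square_statistic(observed: list[list[float]]) -> tuple[float, int, list[list[float]]]:
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Hypothetical counts: rows = exposed / unexposed, columns = disease / no disease.
observed = [[30, 70], [15, 85]]
stat, df, expected = chi_square_statistic(observed)
print(f"chi-square = {stat:.4f}, df = {df}")
print("expected counts:", expected)
```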
Chi-Square Limits and Problems

◎ Implying cause rather than association


◎ Overestimating the importance of a
finding, especially with large sample
sizes
◎ Failure to recognize spurious
relationships
◎ Treats all variables as nominal (both IV and
DV): any ordering of the categories is ignored
Chi-Square Attributes

◎ A chi-square analysis is not used to prove a
hypothesis; it can, however, refute one.
◎ As the chi-square value increases, the probability of
seeing a deviation at least that large by random chance
alone decreases.
◎ The results of a chi-square analysis tell you whether the
difference between what you observe and what you expect
is plausibly due to sampling error.
◎ The greater the deviation of what we observe from
what we would expect by chance, the stronger the evidence
that the difference is NOT due to chance.
Critical Chi-square Values

◎ Critical values for chi-square are found in tables, sorted by
degrees of freedom and probability levels. Be sure to use
the 0.05 probability level (p < 0.05).
◎ If your calculated chi-square value is greater than the
critical value from the table, you “reject the null hypothesis.”
◎ If your chi-square value is less than the critical value,
you “fail to reject” the null hypothesis.
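
The decision rule can be written out as a small lookup. The sketch below hard-codes the standard 0.05-level critical values for 1 to 5 degrees of freedom, as they appear in published chi-square tables; the function name `decide` and the two test values are illustrative assumptions.

```python
# Upper-tail chi-square critical values at the 0.05 level, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def decide(chi_square: float, df: int) -> str:
    """Compare a calculated chi-square value with the 0.05 critical value."""
    critical = CRITICAL_05[df]
    if chi_square > critical:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Hypothetical calculated values, both with 1 degree of freedom.
print(decide(6.45, 1))
print(decide(2.10, 1))
```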

CHI-SQUARE TEST

Normally requires a sufficiently large sample size:
○ In general, N > 20.
○ There is no single accepted cutoff; the general rules are:
◉ No cells with observed frequency = 0
◉ No cells with expected frequency < 5
◉ Applying chi-square to very small samples
exposes the researcher to an unacceptable
rate of Type II errors.
Note: chi-square must be calculated on actual
count data, not on percentages; substituting percentages
would have the effect of pretending the sample
size is 100.
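
Both points can be checked numerically. The sketch below uses hypothetical counts with N = 40: it flags violations of the sample-size rules and shows how the statistic is inflated when the same table is entered as percentages of the total. The function names and all data values are assumptions for illustration.

```python
def chi_square(observed):
    """Return (Pearson chi-square statistic, expected counts) for a table."""
    row_totals = [sum(r) for r in observed]
    col_totals = [sum(c) for c in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    return stat, expected


def rule_violations(observed):
    """List which of the sample-size rules of thumb a table breaks."""
    _, expected = chi_square(observed)
    problems = []
    if sum(map(sum, observed)) <= 20:
        problems.append("N is not greater than 20")
    if any(o == 0 for row in observed for o in row):
        problems.append("a cell has observed frequency 0")
    if any(e < 5 for row in expected for e in row):
        problems.append("a cell has expected frequency below 5")
    return problems


counts = [[12, 8], [6, 14]]        # hypothetical raw counts, N = 40
percentages = [[30, 20], [15, 35]]  # the same table as percentages of N

stat_counts, _ = chi_square(counts)
stat_pct, _ = chi_square(percentages)
print(f"from counts:      {stat_counts:.4f}")
print(f"from percentages: {stat_pct:.4f}")
print("violations for counts:", rule_violations(counts))

small = [[3, 1], [0, 4]]            # hypothetical very small sample, N = 8
print("violations for small sample:", rule_violations(small))
```

With these numbers the percentage version is 100/40 = 2.5 times the correct value, enough to move the result across the 0.05 critical value for 1 degree of freedom (3.841).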
