[Table columns: Country, Laboratory-Confirmed Cases, Total Cases, Total Deaths]
• Let’s assume that X is a random variable that can only take two
possible values and let’s denote those values 0 and 1. It’s common
practice to associate “success” with 1 and “failure” with 0.
• $P(X = 1) = p$
• Hence $P(X = 0) = 1 - p = q$. There cannot be a third option, so $p + q = 1$.
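e.g. throwing a fair coin twice (p = q = 0.5), with X the number of heads: if you're betting on the sequence, all four (HH, HT, TH, TT) are equally likely, but if you're betting on the outcome (the number of heads), X = 1 is twice as likely as either X = 0 or X = 2.
P(X=0) -> T T -> q*q = q^2 -> 0.25
P(X=1) -> H T or T H -> p*q + q*p = 2pq -> 0.5
P(X=2) -> H H -> p*p = p^2 -> 0.25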
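The coin example can be checked by brute force. The sketch below assumes a fair coin thrown twice (the variable names are illustrative, not from the slides): it lists every ordered sequence, multiplies the per-throw probabilities, and then adds up the sequences that give the same number of heads.

```python
from itertools import product
from collections import Counter

p = 0.5      # probability of a head (assumption: fair coin)
q = 1 - p    # probability of a tail

# Every ordered sequence of two throws: HH, HT, TH, TT
sequences = ["".join(s) for s in product("HT", repeat=2)]

# Probability of each sequence, accumulated by number of heads
heads_prob = Counter()
for seq in sequences:
    prob = 1.0
    for toss in seq:
        prob *= p if toss == "H" else q
    heads_prob[seq.count("H")] += prob

for x in sorted(heads_prob):
    print(x, heads_prob[x])
```

Each sequence has probability 0.25, but two of the four sequences give exactly one head, which is why that outcome is twice as likely.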
The Binomial Distribution: a series of Bernoulli trials
$$p(x) = \binom{k}{x} p^{x} (1 - p)^{k - x}, \quad x = 0, 1, 2, \ldots, k$$
(x can be anything from 0 to k), where
$$\binom{k}{x} = \frac{k!}{x!\,(k - x)!}$$
This is written $X \sim B(k, p)$ and provides a probability model for the total
number of successes in a sequence of k independent Bernoulli trials, in
which the probability of success in a single trial is p.
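As a minimal sketch, the formula can be written as a short Python function. The name `binomial_pmf` is made up for illustration; `math.comb` plays the role of the binomial coefficient.

```python
from math import comb

def binomial_pmf(x, k, p):
    """P(X = x) for X ~ B(k, p): x successes in k independent trials."""
    return comb(k, x) * p**x * (1 - p)**(k - x)

# Two throws of a fair coin, as in the coin example
k, p = 2, 0.5
probs = [binomial_pmf(x, k, p) for x in range(k + 1)]
print(probs)       # probabilities of 0, 1 and 2 heads
print(sum(probs))  # the probabilities over x = 0..k sum to 1
```

With k = 2 and p = 0.5 this gives the same 0.25, 0.5, 0.25 as the coin example.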
The Binomial Distribution: understanding the formula
$$p(x) = \binom{k}{x} p^{x} (1 - p)^{k - x}, \quad x = 0, 1, 2, \ldots, k$$
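• The first part, $p^{x} (1 - p)^{k - x}$, is the probability of getting x successes and k − x failures in one particular order.
• The problem is that there are usually many possible ways (orders) of doing this.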
• The second part, $\binom{k}{x}$ (highlighted), is therefore the number of possible ways of getting
x successes from k trials.
• It’s called the binomial coefficient and the button on your calculator looks like
this: nCr, where n = k and r = x. Just make sure that you know where this button is!
• In R there is a function called choose(k,x).
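To see that the binomial coefficient really counts orderings, one can list them. This is only a sketch with illustrative values (k = 7, x = 2, chosen here, not taken from the slides); Python's `math.comb(k, x)` is used as the counterpart of R's `choose(k, x)`.

```python
from itertools import product
from math import comb

k, x = 7, 2   # illustrative values: 7 trials, 2 successes

# Brute force: every sequence of k trials (1 = success, 0 = failure),
# keeping only those with exactly x successes
orderings = [seq for seq in product((0, 1), repeat=k) if sum(seq) == x]

print(len(orderings))  # number of distinct orderings found by listing
print(comb(k, x))      # binomial coefficient k! / (x! (k - x)!)
```

Both numbers agree. Listing the sequences becomes impractical as k grows, which is why the factorial formula is used instead.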
The Binomial Distribution: properties
• In this case the data were collected by Professor Ben Sheldon and his group. They
have data on the number of extra-pair chicks in broods of size seven.
• The appropriate null model is the binomial distribution (with k = 7) and the
appropriate statistic is the chi-squared.
Great tits in Wytham Woods: assembling observed counts.
Number of EPCs    Frequency    Expected?
0                 14
1                 6
2                 4
3                 2
4                 1
5                 1
6                 0
7                 1
Great tits in Wytham Woods: assembling expected counts.
The expected counts should sum to the total number of broods, which is 29.
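A sketch of how the expected counts might be assembled. The assumption here, not stated on the slides, is that p is estimated from the data as total EPCs divided by total chicks; that choice does reproduce the expected values 11.159 and 7.217 quoted for one and two EPCs.

```python
from math import comb

k = 7                                  # brood size
observed = [14, 6, 4, 2, 1, 1, 0, 1]   # broods with 0, 1, ..., 7 EPCs
n_broods = sum(observed)               # 29 broods in total

# Assumption: estimate p as (total EPCs) / (total chicks)
total_epcs = sum(x * f for x, f in enumerate(observed))
p_hat = total_epcs / (n_broods * k)

# Expected count = number of broods * binomial probability of x EPCs
expected = [
    n_broods * comb(k, x) * p_hat**x * (1 - p_hat)**(k - x)
    for x in range(k + 1)
]

for x, (o, e) in enumerate(zip(observed, expected)):
    print(x, o, round(e, 3))
print("total expected:", round(sum(expected), 3))
```

The expected counts sum to 29 because the binomial probabilities sum to 1.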
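Number of EPCs    Observed    Expected
0                 14          7.395
1                 6           11.159
2                 4           7.217
>2                5           3.229

χ² = Σ (observed − expected)² / expected = 10.689
df = k − n − 1, where k is the number of rows and n is the number of parameters estimated from the data (k here is not the brood size), so df = 4 − 1 − 1 = 2.
Bit naughty, as all expected values should be > 5.
There is more variability in the data than expected; males and females are different. In the observed data there are more zeroes and more high values than expected.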
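The chi-squared arithmetic can be sketched as follows. The expected count for zero EPCs is taken as 29 minus the other three expected counts (7.395), which is an inference from the stated total; the variable names are illustrative.

```python
# Pooled categories: 0, 1, 2 and >2 EPCs
observed = [14, 6, 4, 5]
expected = [7.395, 11.159, 7.217, 3.229]  # 7.395 = 29 - 11.159 - 7.217 - 3.229

# Sum of (observed - expected)^2 / expected over the rows
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

n_rows = len(observed)   # 4 rows after pooling
n_params = 1             # one parameter (p) estimated from the data
df = n_rows - n_params - 1

print(round(chi_sq, 2), df)
```

Because the expected counts are rounded to three decimals, the result agrees with the quoted 10.689 only to about two decimal places.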
Great tits in Wytham Woods: significant departure?
[Table columns: Degrees of freedom (df), χ² value]
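For df = 2 the question can be checked without a table, because the upper-tail probability of the chi-squared distribution with 2 degrees of freedom is exactly exp(−x/2). This shortcut holds for df = 2 only; the 5% significance level below is an assumption, not a value from the slides.

```python
from math import exp, log

chi_sq = 10.689   # statistic from the great tit data
df = 2

# Upper-tail probability P(chi-squared >= x); valid for df = 2 only
p_value = exp(-chi_sq / 2)

# Critical value for df = 2 at an assumed 5% significance level
critical_05 = -2 * log(0.05)

print(round(p_value, 4), round(critical_05, 3))
```

The statistic is well above the 5% critical value, so the p-value is below 0.05. For other degrees of freedom, a printed table or a statistics package is needed.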
• Background: large coalitions of males can hold prides for longer. Thus male
reproductive success is highly correlated with the number of same-sex siblings.
This is not the case for females.
• Hypothesis: when litter size is large (4 or more cubs), there should be more litters
than expected with large numbers of male cubs.
• The data: the number of male lion cubs in 34 litters of size four was recorded in
the Serengeti National Park*. The data are: 2 3 3 3 1 3 3 2 3 2 1 4 3 0 0 3 0 0 3 1 3
3 3 3 2 3 3 2 0 2 3 2 1 1
• Analysis: Use these data to test the distribution of male cubs among litters.
Interpret your results.
* Data from: Packer & Pusey (1987). Intrasexual cooperation and the sex ratio in African lions. American Naturalist 130: 636-642.
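A possible starting point for the lion exercise, as a sketch only. Two assumptions are made: the last 13 values of the data are read as single digits (which gives the stated 34 litters), and p is estimated from the data rather than fixed at 0.5. Fixing p = 0.5 would be an equally reasonable null model and would change the degrees of freedom.

```python
from collections import Counter
from math import comb

# Number of male cubs in each litter of size four
# (assumption: the final 13 values are single digits)
males = [2, 3, 3, 3, 1, 3, 3, 2, 3, 2, 1, 4, 3, 0, 0, 3, 0, 0, 3, 1, 3,
         3, 3, 3, 2, 3, 3, 2, 0, 2, 3, 2, 1, 1]

k = 4                     # litter size
n_litters = len(males)    # should be 34

# Observed number of litters with 0, 1, ..., 4 male cubs
counts = Counter(males)
observed = [counts[x] for x in range(k + 1)]

# Assumption: estimate p as (total males) / (total cubs)
p_hat = sum(males) / (n_litters * k)

# Expected number of litters under B(k, p_hat)
expected = [
    n_litters * comb(k, x) * p_hat**x * (1 - p_hat)**(k - x)
    for x in range(k + 1)
]

for x in range(k + 1):
    print(x, observed[x], round(expected[x], 3))
```

Some expected counts are small, so categories may need pooling before the chi-squared statistic is calculated, as was done for the great tit data.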