The Gamma Distribution

The gamma distribution is a continuous probability distribution that is popular for a range of
phylogenetic applications. It is popular in part because it is a bit of a shape shifter that can
assume a range of shapes, from exponential to approximately normal. This flexibility results from
the fact that the gamma distribution has two parameters. In most phylogenetic applications, these
parameters are referred to as the shape parameter, α, and the rate parameter, β (note that in some
applications the rate parameter, β, is replaced by the "scale parameter," which is simply the
inverse of the rate parameter).
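As a quick check on the rate/scale relationship, the sketch below evaluates the same gamma density twice with R's dgamma(), once through the rate argument and once through scale = 1/rate. The shape and rate are the SIMMAP default values used in the scripts in this document; the grid x and the variable names are arbitrary choices made for illustration.

```r
# Evaluate one gamma density two ways: with a rate, and with the equivalent scale.
shape <- 1.25                 # alpha (SIMMAP 1.5 default, as used in this document)
rate <- 0.25                  # beta
x <- seq(0.5, 20, by = 0.5)   # arbitrary grid of positive values (illustration only)

densityFromRate <- dgamma(x, shape = shape, rate = rate)
densityFromScale <- dgamma(x, shape = shape, scale = 1 / rate)

# TRUE if the two parameterizations give the same densities
sameDensity <- isTRUE(all.equal(densityFromRate, densityFromScale))
print(sameDensity)
```

dgamma() accepts either argument, but a call that supplies a rate and a scale that disagree is rejected, so it is simplest to pick one parameterization and use it throughout.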
You can explore the impact of these two parameters on the gamma distribution using the following
R scripts (note that these scripts focus on plotting values that flank the default parameter
settings for the gamma distribution in the program SIMMAP 1.5):
Using the R script below, you can generate probability densities for the gamma distribution that
vary the value of the shape parameter (α) (Fig. 1). As you can see, varying α has a strong impact
on the shape of the gamma distribution. For integer values of α, a gamma-distributed variable is
the sum of α independent and identically distributed (i.i.d.) exponential variables (i.e., that
have the same rate parameter, β). Accordingly, when α = 1, the gamma collapses to an exponential
distribution; when α >> 1, the gamma distribution increasingly resembles a normal distribution.
Figure 1: The impact of varying the shape parameter (alpha) on the gamma distribution.
#Generate a plot of gamma distributions that vary the shape parameter (alpha).
x <- seq(0, 100, length=200)
#Make probability density function for SIMMAP default gamma distribution
simmapDefaultGamma <- dgamma(x, shape=1.25, scale=1/0.25)
#Set up empty axes (type="n"); every curve, including the default, is drawn in the loop below
plot(x, simmapDefaultGamma, type="n", yaxs="i", xaxs="i", ylim=c(0,0.16), xlim=c(0,100),
    xlab="x value", ylab="Density",
    main="Probability density for gamma distribution with variable alpha and beta=0.25")
colors <- c("red", "black", "blue", "darkgreen", "purple", "orange")
alphas <- c(0.1, 1.25, 2, 4, 8, 10)
labels <- c("alpha=0.1", "alpha=1.25 (SIMMAP default)", "alpha=2", "alpha=4", "alpha=8",
    "alpha=10")
for(i in seq_along(alphas)) {
    #Hold beta at 0.25 via rate; dgamma() rejects conflicting rate and scale arguments
    hx <- dgamma(x, shape=alphas[i], rate=0.25)
    lines(x, hx, lwd=3, col=colors[i])
}
legend("topright", inset=.05, title="Probability densities",
    labels, lwd=3, col=colors)
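The special cases described above for the shape parameter can also be checked numerically. The following is a minimal sketch, not part of the original scripts: it compares the gamma density at shape 1 with dexp(), simulates sums of k = 4 i.i.d. exponential draws, and measures how far the gamma density is from a normal density with the same mean and variance for a small and a large shape value. The helper normalGap(), the seed, the grid, and the choice of k are all assumptions made for illustration.

```r
# Special and limiting cases of the gamma density (illustrative values only).
rateBeta <- 0.25                      # beta; the SIMMAP default used in this document
x <- seq(0.1, 40, by = 0.1)           # arbitrary grid of positive values

# 1. alpha = 1: the gamma density matches the exponential density with the same rate.
maxDiffExponential <- max(abs(dgamma(x, shape = 1, rate = rateBeta) -
                              dexp(x, rate = rateBeta)))

# 2. Sum of k i.i.d. exponentials: the sample mean and variance should be near
#    k/beta and k/beta^2, the gamma mean and variance for alpha = k.
set.seed(1)                           # arbitrary seed for repeatability
k <- 4
sums <- rowSums(matrix(rexp(k * 1e5, rate = rateBeta), ncol = k))
simulatedMoments <- c(mean = mean(sums), variance = var(sums))

# 3. Large alpha: gap to a normal density with matching mean and variance,
#    expressed relative to the peak height of that normal density.
normalGap <- function(alpha, rate) {
  xs <- seq(0, 4 * alpha / rate, length.out = 2000)
  gammaDensity <- dgamma(xs, shape = alpha, rate = rate)
  normalDensity <- dnorm(xs, mean = alpha / rate, sd = sqrt(alpha) / rate)
  max(abs(gammaDensity - normalDensity)) / max(normalDensity)
}
gapAlpha2 <- normalGap(2, rateBeta)
gapAlpha100 <- normalGap(100, rateBeta)

print(maxDiffExponential)
print(simulatedMoments)
print(c(gapAlpha2 = gapAlpha2, gapAlpha100 = gapAlpha100))
```

With these settings the simulated mean and variance should land close to 16 and 64, and the relative gap to the normal curve should be much smaller at alpha = 100 than at alpha = 2.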
Using the R script below, you can visualize the impact of varying the rate parameter (β) while
keeping the shape parameter (α) constant (Fig. 2). β has a strong impact on the shape of the gamma
distribution. When β is set to less than 1, we tend to observe relatively broad distributions with
long tails. As we increase the value of β, we observe increasingly tight distributions. This effect
stems from the fact that the variance of the gamma is α/β², so the spread shrinks with the square
of the rate.
Figure 2: The impact of varying the rate parameter (beta) on the gamma distribution.
#Generate a plot of gamma distributions that vary the rate parameter (beta) as in Figure 2.
x <- seq(0, 100, length=200)
#Make probability density function for SIMMAP default gamma distribution
simmapDefaultGamma <- dgamma(x, shape=1.25, scale=1/0.25)
plot(x, simmapDefaultGamma, type="l", yaxs="i", xaxs="i", ylim=c(0,0.9), xlim=c(0,70),
    xlab="x value", ylab="Density",
    main="Probability density for gamma distribution with a fixed alpha=1.25 and variable beta",
    lwd=2)
colors <- c("red", "black", "blue", "darkgreen", "purple", "orange")
betas <- c(0.1, 0.25, 2, 4, 8, 10)
labels <- c("beta=0.1", "beta=0.25 (SIMMAP default)", "beta=2", "beta=4", "beta=8", "beta=10")
for(i in seq_along(betas)) {
    #Pass beta as the rate; dgamma() rejects conflicting rate and scale arguments
    hx <- dgamma(x, shape=1.25, rate=betas[i])
    lines(x, hx, lwd=2, col=colors[i])
}
legend("topright", inset=.05, title="Probability densities", labels, lwd=2, col=colors)
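The link between the rate parameter and the spread of the distribution can be checked by numerical integration. The sketch below is an illustration under stated assumptions rather than part of the original scripts: it uses R's integrate() to compute the mean and variance of the gamma density for the same beta values plotted above, then sets them beside the closed-form values alpha/beta and alpha/beta^2. The helper gammaMean() and gammaVariance() functions and the summaryTable layout are hypothetical names introduced here.

```r
# Numerically check mean = alpha/beta and variance = alpha/beta^2 for a fixed shape.
alpha <- 1.25
betas <- c(0.1, 0.25, 2, 4, 8, 10)

# Mean of the gamma density by numerical integration over (0, Inf)
gammaMean <- function(shape, rate) {
  integrate(function(x) x * dgamma(x, shape = shape, rate = rate),
            lower = 0, upper = Inf)$value
}

# Variance as the integral of the squared deviation from the mean
gammaVariance <- function(shape, rate) {
  m <- gammaMean(shape, rate)
  integrate(function(x) (x - m)^2 * dgamma(x, shape = shape, rate = rate),
            lower = 0, upper = Inf)$value
}

numericMean <- sapply(betas, function(b) gammaMean(alpha, b))
numericVariance <- sapply(betas, function(b) gammaVariance(alpha, b))

summaryTable <- data.frame(beta = betas,
                           mean = numericMean,
                           meanFormula = alpha / betas,
                           variance = numericVariance,
                           varianceFormula = alpha / betas^2)
print(summaryTable)
```

Because the variance falls with the square of beta, moving from beta = 0.1 to beta = 10 reduces the variance by a factor of 10,000 while the mean drops only by a factor of 100.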
For many phylogenetic applications of the gamma distribution (e.g., to accommodate variation in
substitution rate across sites, ASRV), the α and β parameters are constrained to be equal. You can
visualize the gamma distributions generated under these conditions using the R script below
(Fig. 3). Because the mean of the gamma distribution is α/β, this constraint ensures that the
gamma distribution has a mean of one. This is important when the gamma distribution is used as a
prior probability density on ASRV, because it preserves the interpretation of branch lengths as
the expected (mean) number of substitutions per site.
Figure 3: Gamma distributions with alpha and beta set equal to one another.
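#Generate a plot of gamma distributions with alpha and beta equal to one another as in Figure 3.
x <- seq(0, 100, length=200)
#Make probability density function for SIMMAP default gamma distribution (drawn in black)
simmapDefaultGamma <- dgamma(x, shape=1.25, scale=1/0.25)
plot(x, simmapDefaultGamma, type="l", yaxs="i", xaxs="i", ylim=c(0,2), xlim=c(0,30),
    xlab="x value", ylab="Density",
    main="Probability density for gamma distribution with alpha and beta equal to one another",
    lwd=2, col="black")
colors <- c("red", "black", "blue", "darkgreen", "purple", "orange")
alphas <- c(0.1, 0.5, 1, 5, 20)
betas <- c(0.1, 0.5, 1, 5, 20)
labels <- c("alpha=0.1, beta=0.1", "alpha=1.25, beta=0.25 (SIMMAP default)", "alpha=0.5, beta=0.5",
    "alpha=1, beta=1", "alpha=5, beta=5", "alpha=20, beta=20")
#Black (colors[2]) is reserved for the SIMMAP default, so the loop uses the remaining colors
curveColors <- colors[-2]
for(i in seq_along(betas)) {
    #Pass beta as the rate; dgamma() rejects conflicting rate and scale arguments
    hx <- dgamma(x, shape=alphas[i], rate=betas[i])
    lines(x, hx, lwd=2, col=curveColors[i])
}
legend("topright", inset=.05, title="Probability densities", labels, lwd=2, col=colors)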
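To make the effect of the alpha = beta constraint concrete, the sketch below tabulates the mean, the variance, and a central 95% interval of the gamma distribution for the same alpha values plotted above. It is an illustration only: the interval bounds come from R's qgamma(), and the table layout and variable names are assumptions introduced here.

```r
# Summarize gamma distributions under the constraint alpha = beta.
alphas <- c(0.1, 0.5, 1, 5, 20)
betas <- alphas                       # the constraint: rate equals shape

meanRate <- alphas / betas            # alpha/beta, which is 1 whenever alpha = beta
varRate <- alphas / betas^2           # alpha/beta^2, which reduces to 1/alpha

# Central 95% interval of each distribution
lower95 <- qgamma(0.025, shape = alphas, rate = betas)
upper95 <- qgamma(0.975, shape = alphas, rate = betas)

rateSpread <- data.frame(alpha = alphas, mean = meanRate, variance = varRate,
                         lower95 = lower95, upper95 = upper95)
print(rateSpread)
```

The mean stays at one in every row, while the variance (1/alpha) and the width of the interval shrink as alpha grows. In the ASRV setting this means small alpha values correspond to strong rate variation among sites and large values to nearly equal rates.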
