You are on page 1of 6

Probability functions in R http://rstudio-pubs-static.s3.amazonaws.com/1790_fa129fa38dbe4cd18d...

Probability functions in R
## Settings for RMarkdown http://yihui.name/knitr/options#chunk_options
opts_chunk$set(comment = "", warning = FALSE, message = FALSE, tidy = FALSE,
echo = TRUE, fig.width = 7, fig.height = 7)
options(width = 116, scipen = 10)

setwd("~/statistics/bio201/")

library(ggplot2)

1 de 6 5/04/2020, 3:31 p. m.
Probability functions in R http://rstudio-pubs-static.s3.amazonaws.com/1790_fa129fa38dbe4cd18d...

function type What it does -norm (normal distribution) -binom (binomial distribution)
d- probability density (mass): y value given x value dnorm dbinom
p- probability given x value pnorm pbinom
q- x value given probability qnorm qbinom

2 de 6 5/04/2020, 3:31 p. m.
Probability functions in R http://rstudio-pubs-static.s3.amazonaws.com/1790_fa129fa38dbe4cd18d...

Probability density functions (d-): Return y value given x value


Normal distribution

## Density at 90 in a normal distribution with a mean of 124 and sd of 20


dnorm(x = 90, mean = 124, sd = 20)

[1] 0.004702

## Density at c(10,20,30) in a normal distribution with a mean of 124 and sd of 20


dnorm(x = c(10,20,30), mean = 124, sd = 20)

[1] 0.000000001757 0.000000026805 0.000000318491

## Graphing the distribution: Do not give a value to x, then wrap with curve() function
curve(dnorm(x, mean = 124, sd = 20), xlim = c(0, 200))
abline(h = 0)

Binomial distribution

## Density (mass) at 3 in a binomial distribution with 5 trials with success probability of 0.4
dbinom(x = 3, size = 5, prob = 0.4)

[1] 0.2304

## Density (mass) at c(1,2,3) in a binomial distribution with 5 trials with success probability of 0.4
dbinom(x = c(1,2,3), size = 5, prob = 0.4)

[1] 0.2592 0.3456 0.2304

## Graphing the distribution: Use barplot() as it is a discrete distribution. Give 0:5 (= c(0,1,2,3,4,5))
barplot(dbinom(x = 0:5, size = 5, prob = 0.4), names.arg = 0:5)

Probability function (p-) and Quantile function (q-)


Probability function (p-): Given an x value, it returns the probability (AUC) of having a value lower than x.
Quantile function (q-): Given a probability (AUC), it returns the x value at the upper boundary.
These two are doing the opposite actions.

Normal distribution

## Probability of having a value *lower* than 90 in a normal distribution with a mean of 124 and sd of 20.
pnorm(q = 90, mean = 124, sd = 20, lower.tail = TRUE)

[1] 0.04457

3 de 6 5/04/2020, 3:31 p. m.
Probability functions in R http://rstudio-pubs-static.s3.amazonaws.com/1790_fa129fa38dbe4cd18d...

curve(dnorm(x, mean = 124, sd = 20), xlim = c(0, 200))


abline(h = 0)
sequence <- seq(0, 90, 0.1)
polygon(x = c(sequence,90,0),
y = c(dnorm(c(sequence),124,20),0,0),
col = "grey")

## Probability of having a value *higher* than 140 in a normal distribution with a mean of 124 and sd of 20.
1 - pnorm(q = 140, mean = 124, sd = 20, lower.tail = TRUE)

[1] 0.2119

pnorm(q = 140, mean = 124, sd = 20, lower.tail = FALSE)

[1] 0.2119

curve(dnorm(x, mean = 124, sd = 20), xlim = c(0, 200))


abline(h = 0)
sequence <- seq(140, 200, 0.1)
polygon(x = c(sequence,200,140),
y = c(dnorm(c(sequence),124,20),0,0),
col = "grey")

## X-axis value *below* which the probability (AUC) is 10%


qnorm(p = 0.10, mean = 124, sd = 20, lower.tail = TRUE)

[1] 98.37

curve(dnorm(x, mean = 124, sd = 20), xlim = c(0, 200))


abline(h = 0)
sequence <- seq(0, 98.36897, 0.1)
polygon(x = c(sequence,98.36897,0),
y = c(dnorm(c(sequence),124,20),0,0),
col = "grey")

4 de 6 5/04/2020, 3:31 p. m.
Probability functions in R http://rstudio-pubs-static.s3.amazonaws.com/1790_fa129fa38dbe4cd18d...

## X-axis value *above* which the probability (AUC) is 20%


qnorm(p = 0.10, mean = 124, sd = 20, lower.tail = FALSE)

[1] 149.6

curve(dnorm(x, mean = 124, sd = 20), xlim = c(0, 200))


abline(h = 0)
sequence <- seq(149.631, 200, 0.1)
polygon(x = c(sequence,200,149.631),
y = c(dnorm(c(sequence),124,20),0,0),
col = "grey")

Binomial distribution
Definitions are confusing. It is probably safer to use summation of dbinom.

## Probability of having a value *lower* than or equal to 3 in a binomial distribution with 5 trials with success probability of 0.4. P[X <= x]. Include the "equal to" value.
pbinom(q = 3, size = 5, prob = 0.4, lower.tail = TRUE)

[1] 0.913

barplot(dbinom(x = 0:5, size = 5, prob = 0.4), names.arg = 0:5, col = rep(c("grey","white"), c(4,2)))

## Probability of having a value *higher* than 3 in a binomial distribution with 5 trials with success probability of 0.4. P[X > x]. Be careful it excludes the probability at 3.
pbinom(q = 3, size = 5, prob = 0.4, lower.tail = FALSE)

[1] 0.08704

barplot(dbinom(x = 0:5, size = 5, prob = 0.4), names.arg = 0:5, col = rep(c("white","grey"), c(4,2)))

5 de 6 5/04/2020, 3:31 p. m.
Probability functions in R http://rstudio-pubs-static.s3.amazonaws.com/1790_fa129fa38dbe4cd18d...

## Summation of individual probabilities may be safer.


sum(dbinom(x = 3:5, size = 5, prob = 0.4))

[1] 0.3174

## X-axis value *below (including the value)* which the summation of probabilities is greater than or equal to p.
qbinom(p = 0.3, size = 5, prob = 0.4, lower.tail = TRUE)

[1] 1

barplot(dbinom(x = 0:5, size = 5, prob = 0.4), names.arg = 0:5, col = rep(c("grey","white"), c(2,4)))

## Summation of individual probabilities may be safer.


cumsum(dbinom(x = 0:5, size = 5, prob = 0.4))

[1] 0.07776 0.33696 0.68256 0.91296 0.98976 1.00000

## Summation in the opposite direction.


cumsum(rev(dbinom(x = 0:5, size = 5, prob = 0.4)))

[1] 0.01024 0.08704 0.31744 0.66304 0.92224 1.00000

6 de 6 5/04/2020, 3:31 p. m.

You might also like