
Probability Mass Functions for Bernoulli and Binomial Distribution

The binomial distribution is a discrete probability distribution that models the number of successes in n independent Bernoulli trials, each with success probability p.


For a discrete random variable X and a realization X = x, the binomial mass function f is

f(x) = Pr(X = x) = (n choose x) p^x (1 - p)^(n - x), for x = 0, 1, ..., n,

where (n choose x) = n! / (x!(n - x)!) counts the ways to place x successes among n trials.
R Functions for the Binomial Distribution
The built-in functions dbinom(), pbinom(), qbinom(), and rbinom() are all relevant to the binomial and Bernoulli distributions.
1. The dbinom() function directly provides the mass function probabilities Pr(X = x) for any valid x, that is, 0 ≤ x ≤ n. This function is used to find the probability at a particular value for data that follow a binomial distribution, i.e. it finds Pr(X = k).
Syntax: dbinom(k, n, p), where n is the total number of trials, p is the probability of success, and k is the value at which the probability is evaluated.
Example: Pr(X = 5) for the die-roll example:
dbinom(x = 5, size = 8, prob = 1/6)
Output: 0.004167619
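As a sanity check, the dbinom() value can be reproduced directly from the binomial mass function formula, using the same settings as the die-roll example (n = 8 trials, p = 1/6):

```r
# Reproduce Pr(X = 5) for n = 8 trials with success probability 1/6
# by evaluating the binomial mass function by hand.
n <- 8; p <- 1/6; x <- 5
manual  <- choose(n, x) * p^x * (1 - p)^(n - x)
builtin <- dbinom(x = 5, size = 8, prob = 1/6)
manual   # 0.004167619
builtin  # 0.004167619
```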
2. The pbinom() function is used to find the cumulative probability for data following a binomial distribution up to a given value, i.e. it finds Pr(X ≤ k).
Syntax: pbinom(k, n, p), where n is the total number of trials, p is the probability of success, and k is the value up to which probabilities are accumulated.
Example: For the probability that you observe three or fewer 4s, Pr(X ≤ 3), you can either sum the relevant individual entries from dbinom() as earlier or use pbinom().
pbinom(q = 3, size = 8, prob = 1/6)
Output: 0.9693436
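The cumulative value from pbinom() can also be obtained by summing the individual dbinom() masses for x = 0 through 3:

```r
# Pr(X <= 3) two ways: summing individual masses versus the CDF directly.
by_sum <- sum(dbinom(x = 0:3, size = 8, prob = 1/6))
by_cdf <- pbinom(q = 3, size = 8, prob = 1/6)
by_sum  # 0.9693436
by_cdf  # 0.9693436
```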
3. The qbinom() function: Less frequently used is qbinom(), which is the inverse of pbinom(). Where pbinom() provides a cumulative probability when given a quantile value q, qbinom() provides a quantile value when given a cumulative probability p.
Syntax: qbinom(P, n, p), where P is the cumulative probability, n is the total number of trials, and p is the probability of success.
Example: qbinom(p = 0.95, size = 8, prob = 1/6)
Output: 3
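The relationship between qbinom() and pbinom() can be checked directly: qbinom() returns the smallest value k whose cumulative probability reaches the requested level.

```r
# qbinom(p, ...) returns the smallest k with Pr(X <= k) >= p.
pbinom(q = 2, size = 8, prob = 1/6)     # 0.8651531 -- still below 0.95
pbinom(q = 3, size = 8, prob = 1/6)     # 0.9693436 -- reaches 0.95
qbinom(p = 0.95, size = 8, prob = 1/6)  # 3
```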
4. The rbinom() function: Random realizations of a binomially distributed variable are generated using rbinom().
Syntax: rbinom(n, N, p), where n is the number of observations, N is the total number of trials, and p is the probability of success.
Example: rbinom(8, size = 13, prob = 1/6)
Output: 2 3 4 4 1 3 3 5 (random draws; values differ from run to run)
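Because rbinom() is random, its output changes from run to run; set.seed() makes a run reproducible. As a small sketch, the long-run average of the draws should approach size × prob:

```r
# Reproducible draws, plus a check that the sample mean approaches
# the theoretical mean size * prob = 13 * (1/6) ~ 2.17.
set.seed(42)
rbinom(n = 8, size = 13, prob = 1/6)             # eight random counts between 0 and 13
mean(rbinom(n = 100000, size = 13, prob = 1/6))  # close to 2.17
```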
Probability Density Functions for the Uniform and Normal Distributions
Uniform: The continuous uniform distribution is the probability distribution of a random number selected from the continuous interval between endpoints a and b.
R Functions for the Uniform Distribution
1. The dunif() function: You can use the built-in d-function for the uniform distribution, dunif(), to return these density heights for any value within the defined interval.
• The dunif() method in R is used to evaluate the density function. It calculates the uniform density in the specified interval (a, b).
• Syntax: dunif(x, min = 0, max = 1, log = FALSE)
When x is a vector, a result is produced for each of its values, so a sequence of densities is returned.
• Example: dunif(x = c(-2, -0.33, 0, 0.5, 1.05, 1.2), min = -0.4, max = 1.1)
Output: 0.0000 0.6667 0.6667 0.6667 0.6667 0.0000
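These heights come straight from the uniform density formula: inside [a, b] the density is the constant 1/(b − a), and outside it is 0. For the interval (−0.4, 1.1):

```r
# The uniform density is 1 / (b - a) inside the interval and 0 outside it.
a <- -0.4; b <- 1.1
1 / (b - a)                        # 0.6666667
dunif(x = 0.5, min = a, max = b)   # 0.6666667 (inside the interval)
dunif(x = 1.2, min = a, max = b)   # 0 (outside the interval)
```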
2. The punif() function in R is used to calculate the uniform cumulative distribution function, that is, the probability of the variable X taking a value lower than or equal to x (that is, X ≤ x).
• If we need to compute Pr(X > x), we can calculate 1 - punif(x).
• Syntax: punif(q, min = 0, max = 1, lower.tail = TRUE)
• Example (with an illustrative interval of 10 to 30): punif(15, min = 10, max = 30)
Output: 0.25
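The upper-tail probability Pr(X > x) can be obtained either by subtraction or with the lower.tail argument; the interval endpoints below (10 and 30) are illustrative, not part of the original example:

```r
# Pr(X > 15) for a uniform variable on (10, 30), computed two ways.
1 - punif(15, min = 10, max = 30)                  # 0.75
punif(15, min = 10, max = 30, lower.tail = FALSE)  # 0.75
```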
3. The qunif() function: By a quantile, we mean the value below which a given fraction (or percent) of points lies.
• The qunif() method is used to calculate the quantile corresponding to any probability p for a given uniform distribution. To use it, simply call the function with the required parameters.
• The qunif() function is the inverse of punif().
• Syntax: qunif(p, min = 0, max = 1)
• Example (with an illustrative interval of 10 to 20): qunif(0.2, min = 10, max = 20)
Output: 12
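Because qunif() inverts punif(), feeding the resulting quantile back through punif() recovers the original probability (the endpoints 10 and 20 are illustrative):

```r
# Round trip: probability -> quantile -> the same probability.
q <- qunif(0.2, min = 10, max = 20)  # 12
punif(q, min = 10, max = 20)         # 0.2
```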
4. The runif() function:
• The runif() function in R is used to generate a sequence of random numbers following the uniform distribution.
• Syntax: runif(n, min = 0, max = 1)
• Example: runif(15, min = 1, max = 3)
Output: 15 random values between 1 and 3 (different on every run)
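A quick sketch of runif() in use; set.seed() makes the draws reproducible, and every value falls inside [min, max]:

```r
# Fifteen reproducible uniform draws from [1, 3].
set.seed(1)
u <- runif(15, min = 1, max = 3)
length(u)             # 15
all(u >= 1 & u <= 3)  # TRUE
```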
Sampling distribution: A sampling distribution is just like any other probability distribution, but it is specifically associated with a random variable that is a sample statistic; the probability distribution associated with a statistic is called its sampling distribution.
Hypotheses: A hypothesis is an assumption made by researchers that is not necessarily true. In simple words, a hypothesis is a decision taken based on the data of the population collected for an experiment; it is not mandatory for this assumption to be true every time.
Null Hypothesis (H0): The null hypothesis is interpreted as the baseline or no-change hypothesis and is the claim that is assumed to be true. The null hypothesis is often (but not always) defined as an equality, =, to a null value.
Alternative Hypothesis (Ha or H1): The alternative hypothesis is the conjecture that you’re testing for, against the null hypothesis. It is the complement of the null hypothesis, and (as the situation you’re testing for) is often defined in terms of an inequality to the null value.
Types of Errors in Hypothesis Testing
In hypothesis testing, an error is a mistaken acceptance or rejection of a particular hypothesis. There are two types of errors that can arise in hypothesis testing: Type I error and Type II error.
The four key steps involved are:
• State the hypotheses: form the null hypothesis and the alternative hypothesis.
Null Hypothesis (H0): a statement of no effect or no difference in the population or data samples.
Alternative Hypothesis (H1 or Ha): a statement that there is an effect or a difference in the population.
• Formulate an analysis plan and set the criteria for decision (set the significance level). The significance level varies depending on the use case, but the default value is 0.05.
• Calculate the test statistic and p-value: perform a statistical test that suits the data; the resulting probability is known as the p-value.
• Check the resulting p-value and make a decision: if the p-value is smaller than the significance level, reject the null hypothesis in favour of the alternative hypothesis; if the p-value is higher than the significance level, retain the null hypothesis.
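The four steps can be walked through with R's built-in binom.test(); the numbers here are illustrative (60 rolls of a die, 18 sixes, testing H0: p = 1/6 against Ha: p ≠ 1/6 at α = 0.05):

```r
# Step 1: H0: p = 1/6 versus Ha: p != 1/6.
# Step 2: significance level alpha = 0.05.
# Step 3: the exact binomial test yields the p-value.
result <- binom.test(x = 18, n = 60, p = 1/6, alternative = "two.sided")
result$p.value          # the p-value of the test
# Step 4: compare the p-value with alpha and decide.
result$p.value < 0.05   # TRUE here, so we reject H0
```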

Components of a Hypothesis Test

1. Hypotheses:
• A hypothesis is a decision taken based on the data of the population collected for an experiment; it is not mandatory for this assumption to be true every time. The two primary types are the null hypothesis (H0), the baseline or no-change claim that is assumed to be true (often, but not always, defined as an equality, =, to a null value), and the alternative hypothesis (Ha or H1), the conjecture you are testing for, which is the complement of the null hypothesis.
2. Test statistic:
• Once the hypotheses are formed, sample data are collected, and statistics are calculated according to the parameters detailed in the hypotheses.
• The test statistic is the statistic that is compared to the appropriate standardized sampling distribution to yield the p-value.
• Specifically, the test statistic is determined by both the difference between the original sample statistic and the null value, and the standard error of the sample statistic.

3. P-value:
• The p-value is the probability used to quantify the amount of evidence, if any, against the null hypothesis. More formally, the p-value is the probability of observing the test statistic, or something more extreme, assuming the null hypothesis is true.
• Put simply, the more extreme the test statistic, the smaller the p-value; and the smaller the p-value, the greater the amount of statistical evidence against the assumed truth of H0.
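As a sketch of the statistic-to-p-value step: when the standardized sampling distribution is the standard normal, a two-sided p-value is the tail area beyond the observed statistic on both sides (the value z = 2.5 here is illustrative):

```r
# Two-sided p-value for an illustrative standardized statistic z = 2.5.
z <- 2.5
p_value <- 2 * pnorm(abs(z), lower.tail = FALSE)
p_value  # 0.01241933
```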
4. Significance level:
For every hypothesis test, a significance level, denoted α, is assumed. This is used to qualify the result of the test. The significance level defines a cutoff point at which you decide whether there is sufficient evidence to view H0 as incorrect and favour Ha instead.
• If the p-value is greater than or equal to α, you conclude there is insufficient evidence against the null hypothesis, and therefore you retain H0.
• If the p-value is less than α, the result of the test is statistically significant. This implies there is sufficient evidence against the null hypothesis, and therefore you reject H0 in favour of Ha.
5. Decision rule: Based on the p-value and the predetermined significance level, the decision rule determines whether to reject the null hypothesis. If the p-value is smaller than the α level, the null hypothesis is rejected in favour of the alternative hypothesis.
6. Conclusion: Based on the decision rule and analysis, a conclusion is drawn regarding the null hypothesis. It states whether there is enough evidence to support the alternative hypothesis or insufficient evidence to reject the null hypothesis.
1. Type I Error (False Positive):
• A Type I error occurs when we reject the null hypothesis but the null hypothesis is actually true. This case is also known as a false positive. If your p-value is less than α, you reject the null hypothesis; if the null is really true, though, α directly defines the probability that you incorrectly reject it. This is referred to as a Type I error.
• Symbolically, it is denoted α (alpha): the significance level, or the probability of making a Type I error. It is set before conducting the test and represents the maximum allowable probability of rejecting the null hypothesis when it is true.
• Example: Suppose a pharmaceutical company is testing a new drug's effectiveness. The null hypothesis (H0) states that the drug has no effect. A Type I error would occur if the company concludes that the drug is effective (rejects the null hypothesis) when, in reality, it has no effect.
2. Type II Error (False Negative):
• A Type II error occurs when we fail to reject the null hypothesis when the null hypothesis is incorrect, that is, when the alternative hypothesis is correct. This case is also known as a false negative.
(H0 = null hypothesis, Ha = alternative hypothesis)
• Mathematical definition of a Type II error:
β = P(fail to reject H0 | H0 is false)
• In other words, a Type II error represents the situation where you conclude there is no significant effect or difference when, in reality, there is one.
• Symbolically, it is denoted β (beta): the probability of making a Type II error. The power of a statistical test is equal to 1 − β and measures the test's ability to detect an effect when it exists.
• Example: Continuing with the drug example, suppose the drug actually has a beneficial effect. A Type II error would occur if the pharmaceutical company fails to conclude that the drug is effective (fails to reject the null hypothesis) when, in fact, it does have a positive effect.
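Both error rates can be estimated by simulation. The sketch below (all settings illustrative) repeatedly generates data under a true H0: p = 1/6 and records how often the test wrongly rejects; because the exact binomial test is conservative, the estimated Type I error rate stays at or below α:

```r
# Estimate the Type I error rate: test data generated with H0 true (p = 1/6).
set.seed(7)
alpha <- 0.05
p_values <- replicate(5000, {
  x <- rbinom(1, size = 60, prob = 1/6)   # data generated under a true H0
  binom.test(x, n = 60, p = 1/6)$p.value
})
mean(p_values < alpha)  # at or below 0.05
```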
