
Binomial Distribution
Presented by:
Introduction

• The binomial distribution is a probability distribution that summarizes the
likelihood that a value will take one of two independent values under a given
set of parameters or assumptions. Its underlying assumptions are that each
trial has only one outcome and that the trials are mutually exclusive, or
independent of one another. In statistics, the binomial distribution is one
of the most common discrete distributions, as opposed to a continuous
distribution such as the normal distribution.
Symmetry Property of Binomial distribution

• An article by Stattrek (2022) notes that, for the proportion of successes
in n trials, the mean is $p$ and the standard deviation is
$\sqrt{p(1-p)/n}$. The shape of a binomial distribution is symmetrical when
p = 0.5 and becomes approximately symmetrical when n is large.
Mean, the Variance, and the Standard Deviation

• For a binomial distribution with n trials and success probability P, the
mean is $\mu_x = nP$, the variance is $\sigma_x^2 = nP(1-P)$, and the
standard deviation is $\sigma_x = \sqrt{nP(1-P)}$ (StatsDirect, 2021).
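The three formulas above can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `binomial_summary` and the example values n = 10, P = 0.5 are illustrative assumptions.

```python
import math

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a Binomial(n, p) count."""
    mean = n * p                    # mu = n * P
    variance = n * p * (1 - p)      # sigma^2 = n * P * (1 - P)
    return mean, variance, math.sqrt(variance)

# Illustrative values (assumed): 10 trials with success probability 0.5.
mean, variance, sd = binomial_summary(10, 0.5)
print(mean, variance, round(sd, 4))
```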
Simple Example of Binomial Distribution

• An article by the Corporate Finance Institute (2020) describes three
criteria for the trials of a binomial distribution. First, the number of
trials is fixed: the number of times a trial is performed is set from the
start, so if a coin is flipped 10 times, each flip is one trial. Second, the
trials are independent: in tossing a coin or rolling a die, the outcome of
the first toss does not affect the subsequent ones. Third, the probability of
success is fixed: when a person tosses a coin, the probability of getting a
head is 0.5 on every toss, so in 50 trials the expected number of heads is
25 (50 x 0.5).
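The expected value in the coin example can also be recovered directly from the binomial probabilities. This is a small sketch under the assumptions stated in the text (50 independent flips, P = 0.5); the helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 50, 0.5
# Expected number of heads from the definition E[X] = sum of k * P(X = k).
expected_heads = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
# The probabilities over all possible head counts should add up to 1.
total_probability = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(round(expected_heads, 6), round(total_probability, 6))
```

The summed value agrees with the shortcut n x P = 25 used in the example.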
Exercise 66 - Introduction

• The problem concerns companies that use a quality control procedure called
acceptance sampling to monitor incoming shipments of parts, raw materials,
and so on. In the electronics industry, parts are commonly shipped from
suppliers in large lots. Inspection of a sample of n components can be viewed
as the n trials of a binomial experiment. The outcome for each component
tested (trial) is that the component is classified as good or defective.
Reynolds Electronics accepts a lot from a particular supplier if the
defective components in the lot do not exceed 1%. In the problem, a sample of
five items from a recent shipment is tested.
Problem

• Assume that 1% of the shipment is defective. Compute the probability
that no items in the sample are defective.
(Probability of Defective) P (D) = 1% = 0.01
(Probability of Non-Defective) P (D’) = 1 - 0.01 = 0.99
$P(X = 0) = \binom{5}{0}(0.01)^0(0.99)^5 = 0.9510 = 95.1\%$
(0 defective)
• Assume that 1% of the shipment is defective. Compute the probability
that exactly one item in the sample is defective.
(Probability of Defective) P (D) = 1% = 0.01
(Probability of Non-Defective) P (D’) = 1 - 0.01 = 0.99
$P(X = 1) = \binom{5}{1}(0.01)^1(0.99)^4 = 0.0480 = 4.8\%$
(1 defective)
Problem

• What is the probability of observing one or more defective items in the
sample if 1% of the shipment is defective?

(Probability of Defective) P (D) = 1% = 0.01
(Probability of Non-Defective) P (D’) = 1 - 0.01 = 0.99
$P(X \geq 1) = 1 - P(X = 0) = 1 - \binom{5}{0}(0.01)^0(0.99)^5 = 1 - 0.9510 = 0.0490 = 4.9\%$
(1 or more defective)
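The three acceptance-sampling probabilities can be reproduced with a few lines of code. This is a minimal sketch assuming a sample of n = 5 and a defect rate of 0.01, as stated in the exercise; the helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.01                    # sample size and assumed defect rate
p_none = binomial_pmf(0, n, p)    # no defective items
p_one = binomial_pmf(1, n, p)     # exactly one defective item
p_at_least_one = 1 - p_none       # one or more defective items (complement rule)

print(f"{p_none:.4f} {p_one:.4f} {p_at_least_one:.4f}")  # 0.9510 0.0480 0.0490
```

The complement rule is used for "one or more" because summing P(X = 1) through P(X = 5) gives the same result with more work.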
Problem

• Would you feel comfortable accepting the shipment if one item was found
to be defective? Why or why not?

No. If only 1% of the shipment were defective, the probability of finding one
or more defective items in a sample of five would be 1 - (0.99)^5, or about
0.049, so a defective item in the sample is an unlikely result under that
assumption and suggests that the defect rate in the lot may exceed the 1%
limit. Beyond the numbers, a defective item is a more serious problem than
most other issues that can arise on receiving a shipment: the buyer pays for
the product as well as for its protective packaging during shipment and
delivery, and it is the seller's responsibility to check that the products
work before packaging, shipment, and delivery.
Sign Nonparametric Test: Introduction

• The sign test is a non-parametric test used to check whether two groups of
data are equally sized, that is, whether plus and minus signs are equally
likely. The test is used with dependent samples that are ordered in pairs,
where the bivariate random variables are mutually independent. It is based on
the direction of the plus and minus signs of the observations, not on their
numerical magnitude. It is also called the binomial sign test, with
p = 0.5… Statisticians describe the sign test as weaker than other tests
because it only checks whether each paired value falls above or below the
median and does not use the size of the paired difference
(Statistics Solutions, 2021).
Objective and Hypothesis

• In the first type of sign test, the one-sample test, the hypothesis is set
up from the sample data by treating the + and - signs as values of a random
variable that occur in equal numbers. The paired-sample sign test is an
alternative to the paired t-test and uses the + and - signs of the paired
differences, for example in a before-after study. The null hypothesis is set
up so that the + and - signs are equal in number, or the population mean is
equal to the given sample mean (Statistics Solutions, 2021).
Non-Directional and Directional

• Zach (2021) explains that a directional hypothesis is an alternative
hypothesis that contains a less than sign (“<”) or a greater than sign
(“>”), whereas a non-directional hypothesis is an alternative hypothesis
that contains the not equal sign (“≠”).
Problem

• A survey of 40 home prices in a metropolitan area has the following results:


Price              Less than $200,000    Equal to $200,000    More than $200,000
Number of Homes    13                    1                    27
• Test the hypothesis that the median price in the metropolitan area is $200,000.

n1 = 13 (less than $200,000)
n2 = 1 (equal to $200,000)
n3 = 27 (more than $200,000)

• Above are the numbers of homes in each price category of the table. The
symbol n denotes the count of homes in each column provided.
Problem

• Ho = (the median price in the metropolitan area is $200,000)
• Ha = (the median price in the metropolitan area is not $200,000)
• Test Statistic: t-test

• Since this is a t-test with more than two samples, the comparison is
divided into three pairs: groups 1 & 2, groups 1 & 3, and groups 2 & 3.
Because there are three comparisons, the alpha of 0.05 is also divided by 3,
so the alpha is now 0.0166.
Conclusion

• The first comparison yielded a one-tail P(T<=t) of 0.0285. Since this is
not less than 0.0166, there is no significant difference between groups
1 & 2, the homes priced less than and equal to $200,000.
• The second comparison yielded a one-tail P(T<=t) of 0.3037. Since this is
not less than 0.0166, there is no significant difference between groups
1 & 3, the homes priced less than and more than $200,000.
• The third comparison yielded a one-tail P(T<=t) of 0.1878. Since this is
not less than 0.0166, there is no significant difference between groups
2 & 3, the homes priced equal to and more than $200,000.
• Since the group equal to $200,000 falls in the middle of the data and none
of the pairwise differences is significant, the analysis concludes that the
median is equal to $200,000.
Problem

• Test the hypothesis that the median price in the metropolitan area is more
than $200,000.
n1 = 13 (less than $200,000)
n2 = 1 (equal to $200,000)
n3 = 27 (more than $200,000)
• Above are the numbers of homes in each price category of the table. The
symbol n denotes the count of homes in each column provided.
• Ho = (the median price in the metropolitan area is more than $200,000)
• Ha = (the median price in the metropolitan area is not more than $200,000)
• Test Statistic: t-test
• Since this is a t-test with more than two samples, the comparison is
divided into three pairs: groups 1 & 2, groups 1 & 3, and groups 2 & 3.
Because there are three comparisons, the alpha of 0.05 is also divided by 3,
so the alpha is now 0.0166.
Conclusion

• The first comparison yielded a one-tail P(T<=t) of 0.0285. Since this is
not less than 0.0166, there is no significant difference between groups
1 & 2, the homes priced less than and equal to $200,000.
• The second comparison yielded a one-tail P(T<=t) of 0.3037. Since this is
not less than 0.0166, there is no significant difference between groups
1 & 3, the homes priced less than and more than $200,000.
• The third comparison yielded a one-tail P(T<=t) of 0.1878. Since this is
not less than 0.0166, there is no significant difference between groups
2 & 3, the homes priced equal to and more than $200,000.
• Since groups 2 & 3, which represent the values equal to and more than
$200,000, yield a P(T<=t) of 0.1878 and so show no significant difference,
the analysis concludes that the median value is more than $200,000.
• α = 0.05 in both cases. Include hypothesis formulation, use the binomial
approach, and explain your calculations in detail. Include p-values in both
cases.
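The binomial approach requested in the problem is the sign test. The following is a minimal sketch of it, not the method used in the slides. It assumes that the one home priced exactly at $200,000 is dropped as a tie, leaving n = 13 + 27 = 40 signs, and that under Ho each sign is a plus with probability 0.5. For the second case it assumes the conventional formulation Ho: median <= $200,000 against Ha: median > $200,000. The function and variable names are illustrative.

```python
from math import comb

def binomial_tail_ge(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

below, ties, above = 13, 1, 27   # counts from the survey table
n = below + above                # assumption: the tied home is dropped, so n = 40
alpha = 0.05

# Case 1. Ho: median = $200,000 vs Ha: median != $200,000 (two-tailed).
p_two_tailed = min(1.0, 2 * binomial_tail_ge(max(below, above), n))

# Case 2. Ho: median <= $200,000 vs Ha: median > $200,000 (upper tail on plus signs).
p_upper = binomial_tail_ge(above, n)

print(f"two-tailed p = {p_two_tailed:.4f}, reject Ho: {p_two_tailed < alpha}")
print(f"upper-tail p = {p_upper:.4f}, reject Ho: {p_upper < alpha}")
```

Under these assumptions the two-tailed p-value is about 0.038 and the upper-tail p-value is about 0.019. Both are below α = 0.05, so the sign test would reject Ho in both cases, which for the first case differs from the reading of the pairwise t-tests.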
Regression: Introduction

• This data shows the historical performance of the S&P 500, an index
published by S&P Dow Jones Indices, a division of S&P Global, covering the
data gathered from 2012 up to the most recent data recorded this year, 2022.
To give an overview, the S&P 500 Index, or Standard & Poor's 500 Index, is a
market-capitalization-weighted index of 500 of the country's most prominent
publicly traded companies. Because the index applies additional criteria, it
is not an exact list of the top 500 U.S. firms by market cap.
Regression: Introduction

• This table shows the significant changes and consistencies in the yearly
performance of the S&P. The data shown include the total return, which
Lavrakas (2021) explains as a way of estimating all gains from an investment
by taking into account both price appreciation and income over a certain
period, usually a year.
Regression: Introduction

[Figure: scatter plot of Total Return, Price Return, and Net Total Return by year; horizontal axis 2010 to 2022, vertical axis -20 to 40]

• The scatter plot shows the relationship among the yearly performance data
collected from 2012 to 2020.
Regression: Introduction
[Figure: line graph of Total Return, Price Return, and Net Total Return by year; horizontal axis 2010 to 2022, vertical axis -20 to 40]

• This line graph shows the same data, with more emphasis on where the
values increase and decrease.
Regression: Discussion

• To carry out the regression analysis, several tools were used, such as
Google Sheets and add-ons for statistical computation. Below are the
summaries of the computed regression analysis, the scatterplot, the extent of
the residuals, and the normality of the residuals.
Regression: Summary Output
Regression: ANOVA & Intercept
Regression: Residual Output & Probability Output

• The output shows the main results of the regression analysis: the
regression statistics, the ANOVA, the coefficients, standard errors, t stats,
P-values, residual outputs, and probability outputs. The analysis takes these
results as support for its null hypothesis, that there is a significant
change in the collected yearly S&P data, since the P-value is greater than
the standard alpha of 0.05; on this reading the data is changing over time
(Analytics Vidhya, 2021).
Regression: Residual Plot

• A residual plot is a graph that shows the residuals on the vertical axis
and the independent variable on the horizontal axis. A linear regression
model is suitable for the data if the points in the residual plot are
randomly dispersed around the horizontal axis; otherwise, a nonlinear model
is more suitable (Stattrek, n.d.).
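The residuals that go on the vertical axis of such a plot can be computed as observed minus fitted values. The following is a minimal sketch using ordinary least squares on a single predictor; the data values are hypothetical and are not the S&P returns from the slides, and the function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residuals(xs, ys):
    """Observed y minus fitted y for each point."""
    intercept, slope = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical illustrative data, NOT the S&P returns from the slides.
years = [1, 2, 3, 4, 5, 6]
returns = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]

# Each printed pair is one point of the residual plot: (x, residual).
for x, r in zip(years, residuals(years, returns)):
    print(x, round(r, 3))
```

Plotting each residual against its x value and looking for a pattern is the check described above; least squares residuals always sum to zero, so it is the pattern that matters, not the total.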
Regression: Normal Probability Plot

• A normal probability plot is a graphical tool for determining whether a
data set is approximately normally distributed. The data are plotted against
a theoretical normal distribution so that the points should form an
approximately straight line. Departures from the straight line indicate
departures from normality (National Institute of Standards & Technology,
n.d.).
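The coordinates of a normal probability plot can be sketched as sorted observations paired with theoretical normal quantiles. This is a minimal illustration, assuming plotting positions of (i - 0.5) / n, which is one common convention and not necessarily the one used by the spreadsheet add-on in the slides; the sample values are hypothetical.

```python
from statistics import NormalDist

def normal_probability_points(data):
    """Pair each sorted observation with a theoretical standard normal quantile."""
    n = len(data)
    ordered = sorted(data)
    # Assumption: plotting positions (i - 0.5) / n; other conventions exist.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

# Hypothetical residuals, not taken from the slides.
sample = [-1.2, 0.3, 0.8, -0.4, 1.5, -0.9, 0.1]
for q, value in normal_probability_points(sample):
    print(round(q, 3), value)
```

If the printed pairs fall close to a straight line when plotted, the sample is consistent with a normal distribution.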
Regression Analysis

• Regression analysis helps with computing values such as the p-value and
the coefficients used for plotting the data; however, another method, such as
a t-test, might also be suitable. The data has the advantage of being
accessible to almost anyone who searches for it on the web. On the other
hand, the computed low returns are adverse for the people involved, and this
is one of the challenges in trading and other financial assets.
Conclusion

• Since the graph shows a line whose height changes rather than a straight
line, the data is not consistent, which means that the first rule of the
regression analysis, the adequacy check, is not fully met by the given data.
However, the analysis does confirm one thing: a significant change in the
yearly performance data.
References

• Barone, A. (2021, October 9). How binomial distribution works. Investopedia.
https://www.investopedia.com/terms/b/binomialdistribution.asp
• Binomial distribution. (2022). Statistics and Probability.
https://stattrek.com/probability-distributions/binomial.aspx
• Binomial distribution - StatsDirect. (2021). StatsDirect Statistical Analysis
Software. https://www.statsdirect.com/help/distributions/binomial.htm
• Corporate Finance Institute. (2020, May 14). Binomial distribution - Definition,
criteria, and example.
https://corporatefinanceinstitute.com/resources/knowledge/other/binomial-distribution/
• Zach. (2021, April 29). What is a directional hypothesis? (Definition &
examples). Statology. https://www.statology.org/directional-hypothesis/
• Sign test. (2021, August 2). Statistics Solutions.
https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/sign-test/
• Kenton, W. (2022, February 15). S&P 500 index. Investopedia.
https://www.investopedia.com/terms/s/sp500.asp
• Lavrakas, T. (2021, July 16). Understand the total value of your investments with a total
return. Forbes Advisor. https://www.forbes.com/advisor/investing/what-is-total-return/
• Return. (2003, November 25). Investopedia.
https://www.investopedia.com/terms/r/return.asp
• Net total return definition. (n.d.). Law Insider.
https://www.lawinsider.com/dictionary/net-total-return
• Everything you need to know about hypothesis testing in machine learning. (2021,
September 9). Analytics Vidhya.
https://www.analyticsvidhya.com/blog/2021/09/hypothesis-testing-in-machine-learning-everything-you-need-to-know
• Residual analysis in regression. (n.d.). Statistics and Probability.
https://stattrek.com/regression/residual-analysis.aspx
• National Institute of Standards & Technology. (n.d.). Normal probability plot.
https://www.itl.nist.gov/div898/handbook/eda/section3/normprpl.
Thank you!
