
INTRODUCTION

The Poisson distribution is a probability model for the number of times an event occurs. It gives
the probability that a given number of events happens in a fixed interval of time.
It is used to describe discrete random variables that count the number of occurrences
in a particular interval of time or space. The notation for the Poisson distribution is X ~ Po(α), where α is a
positive real number known as the parameter of the distribution. For a Poisson distribution, the
expected value E(X) and the variance Var(X) are both equal to the parameter α.
A Poisson distribution is suitable when the following assumptions hold:

1. The data count the number of times an event occurs in a fixed interval of time or space, and the
same interval is used for every observation.
2. Events occur independently, which means the occurrence of one event does not
affect the occurrence of another.

The probability of a particular number of events happening in the same time
interval is given by

$$P(x; \mu) = \frac{e^{-\mu}\,\mu^{x}}{x!}$$

where

µ = mean or expected value (the parameter α of the distribution)

x = number of occurrences of the event

e = Euler's number, approximately 2.71828
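
As a minimal sketch, the probability formula above can be evaluated directly with Python's standard library. The function name `poisson_pmf` and the mean of 2 events per interval are illustrative assumptions, not values taken from this assignment.

```python
import math


def poisson_pmf(x: int, mu: float) -> float:
    """Return P(x; mu) = e^(-mu) * mu^x / x! for a Poisson variable."""
    if x < 0:
        return 0.0  # a count cannot be negative
    return math.exp(-mu) * mu ** x / math.factorial(x)


# Hypothetical mean of 2 events per interval (illustration only).
mu = 2.0
for x in range(5):
    print(x, round(poisson_pmf(x, mu), 4))
```

Summing `poisson_pmf(x, mu)` over all x gives 1, and the mean of the resulting distribution equals mu, in line with E(X) being equal to the parameter.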

A chi-squared test, also written as χ² test, is a statistical hypothesis test that is valid to perform when the
test statistic is chi-squared distributed under the null hypothesis; the term refers specifically to Pearson's
chi-squared test and variants thereof. Pearson's chi-squared test is used to determine whether there is a
statistically significant difference between the expected frequencies and the observed frequencies in one or
more categories of a contingency table. A chi-squared test is also used to assess whether a frequency
distribution fits a specific pattern. The test statistic is calculated using the formula below,

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where k = number of categories

O_i = observed frequency of category i

E_i = expected frequency of category i
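
The chi-squared formula above can be sketched in a few lines of Python. The function name `chi_squared_statistic` and the frequencies used here are hypothetical and serve only to illustrate the arithmetic; they are not the assignment's data.

```python
def chi_squared_statistic(observed, expected):
    """Sum (O - E)^2 / E over all k categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies for k = 4 categories (illustration only).
observed = [12, 18, 10, 5]
expected = [10.0, 20.0, 10.0, 5.0]

stat = chi_squared_statistic(observed, expected)
print(round(stat, 4))  # 4/10 + 4/20 + 0 + 0 = 0.6
```

A statistic close to zero indicates that the observed frequencies are close to the expected ones.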

An expected frequency is a theoretical frequency predicted under the null hypothesis, which is presumed to be
true until statistical evidence in the form of a hypothesis test indicates otherwise. An observed frequency,
on the other hand, is the actual frequency obtained from the experiment. The critical region, or
rejection region, for the chi-squared statistic is determined by the level of significance and the degrees of
freedom. The degrees of freedom for the chi-squared test are calculated using the formula v = k - 1,
where v is the number of degrees of freedom and k is the number of categories. (This formula applies when no
parameter of the distribution is estimated from the data; each estimated parameter reduces v by one.)
In statistics, the number of degrees of freedom is the number of values in the final calculation of a
statistic that are free to vary, while a critical region is a set of values of the test statistic for which
the null hypothesis is rejected. That is, if the observed test statistic lies in the critical region, we reject
the null hypothesis and accept the alternative hypothesis.
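
The decision rule described above can be sketched as follows. The helper names, the test statistic of 0.6 and the choice of k = 4 categories are hypothetical; the critical value 7.815 is the tabulated upper 5% point of the chi-squared distribution with 3 degrees of freedom.

```python
def degrees_of_freedom(k: int) -> int:
    """v = k - 1, where k is the number of categories."""
    if k < 2:
        raise ValueError("at least two categories are needed")
    return k - 1


def in_critical_region(statistic: float, critical_value: float) -> bool:
    """The null hypothesis is rejected when the statistic exceeds the critical value."""
    return statistic > critical_value


k = 4                       # hypothetical number of categories
v = degrees_of_freedom(k)   # 3
critical_value = 7.815      # from chi-squared tables: 5% level, 3 degrees of freedom
statistic = 0.6             # hypothetical test statistic (illustration only)

if in_critical_region(statistic, critical_value):
    decision = "reject H0"
else:
    decision = "do not reject H0"
print(v, decision)  # 3 do not reject H0
```

The critical value is passed in rather than computed so that the sketch depends only on a value read from statistical tables.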
In this assignment, we investigate whether the Poisson distribution is a suitable model for the number of
absentees, from a group of at least 30 students in our school, recorded over 45 consecutive days. Since the
school principal claims that the daily attendance of the students is unsatisfactory, a
chi-squared goodness-of-fit test is also carried out to test whether the principal's claim is true or
not.
