Advanced Statistics:
Jarque-Bera Test
By:
Vitug, Arianne
III-MATH
SIGN TEST
Many of the hypothesis tests studied so far have imposed one or more requirements on the population
distribution – such as the population being normal or the variances being equal. A nonparametric test is
a hypothesis test that does not require any specific conditions concerning the shape of the populations or
the value of any population parameters. Such tests are sometimes called distribution-free statistics because they do
not depend on the form of the underlying population distribution.
Another important reason for using these tests is that they allow for the analysis of categorical as
well as ranked data. They are widely used for studying populations that take on a ranked order, such as a movie
review that receives one to four stars. Nonparametric tests are usually easier to perform than parametric
tests; however, they are usually less efficient than parametric tests.
One of the easiest nonparametric tests to perform is the sign test. It is a nonparametric test that can
be used to test a hypothesis about a population median, and it is the quickest and simplest nonparametric method.
In this case, we consider testing samples from the same population. In the sign test, we are testing a
hypothesis on the median (M) rather than the mean.
The general steps of hypothesis testing can still be followed. The procedural steps to be followed are as
follows:
Step 1: Hypotheses
Null hypothesis:
H0: M = M0   OR   H0: p = 1/2
Alternative hypothesis:
H1: M ≠ M0   OR   H1: p ≠ 1/2 (two-tailed test)
H1: M > M0   OR   H1: p > 1/2 (upper one-tailed test)
H1: M < M0   OR   H1: p < 1/2 (lower one-tailed test)
Note: In this module we will only consider the upper-tailed and two-tailed hypotheses.
Step 2: Critical Region
Reject H0 if |Z| > Zα/2 (two-sided hypothesis) OR Z > Zα (one-sided upper-tail hypothesis).
Step 3: Test Statistic
● Count the number of positive and negative outcomes.
● Obtain p: p = 1/2 or 0.5, the probability of a positive outcome under H0.
■ q = 1 − p = 1/2, the probability of a negative outcome.
● Using the normal approximation to the binomial distribution, we can say that
R ~ N(np, np(1 − p)) = N(n/2, n/4), where
● np = n/2 is the mean and np(1 − p) = n/4 is the variance.
● For the two-sided hypothesis we consider r = min(n+, n−), and for the one-sided upper tail r = n+.
■ P(R ≥ r) = P(R > r − 1/2) (one-tailed hypothesis)
■ P(R ≤ r) = P(R < r + 1/2) (two-tailed hypothesis)
■ z = (r ± 0.5 − μ)/σ, where μ = n/2 and σ² = n/4
Step 4: Decision
Reject H0 if
|Z| > Zα/2 (two-sided hypothesis)
Z > Zα (one-sided upper-tail hypothesis)
Step 5: Conclusion
EXAMPLE 1
A test was designed for students of a particular course. When tried out on 20 students of another course, it gave the scores: 26, 46, 39, 58, 62,
41, 65, 49, 54, 50, 61, 38, 58, 35, 27, 34, 46, 51, 29, 40. Test the hypothesis that the median is 50
against the alternative that it is not, at the 5% level of significance.
SOLUTION
Step 1: Hypotheses
H0: M = 50
H1: M ≠ 50
Step 2: Critical Region
α = 0.05 → two-sided: α/2 = 0.05/2 = 0.025
Since it is two-sided, reject H0 if |Z| ≥ Zα/2 = 1.96.
Step 3: Test Statistic
Calculate n by adding the number of positive signs (values greater than the hypothesized median) and negative
signs (values less than it); values equal to the median are ignored.
26, 46, 39, 58, 62, 41, 65, 49, 54, 50, 61, 38, 58, 35, 27, 34, 46, 51, 29, 40
Hypothesized median: 50.
n+ = 7
n− = 12
n = n+ + n− = 19
Solving for r,
r = min(n+, n−) = min(7, 12) = 7
R ~ N(n/2, n/4) = N(9.5, 4.75)
P(R ≤ r) = P(R < r + 1/2) = P(R < 7 + 0.5) = P(R < 7.5)
z = (r + 0.5 − μ)/σ = (7 + 0.5 − 9.5)/√4.75 = −0.92
Step 4: Decision
Reject H0 if |Z| ≥ Zα/2 = 1.96
|−0.92| = 0.92
0.92 < 1.96
∴ Fail to reject H0
Step 5: Conclusion
At the 5% level of significance, there is not enough evidence to conclude that the median score differs from 50.
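As a quick check, the calculation above can be reproduced in a few lines of Python. This is a minimal sketch of the normal-approximation sign test; the variable names are our own:

```python
from math import sqrt

# Sign test for H0: M = 50 vs H1: M != 50 (normal approximation).
scores = [26, 46, 39, 58, 62, 41, 65, 49, 54, 50,
          61, 38, 58, 35, 27, 34, 46, 51, 29, 40]
m0 = 50

n_plus = sum(x > m0 for x in scores)    # values above the hypothesized median
n_minus = sum(x < m0 for x in scores)   # values below; ties (= m0) are ignored
n = n_plus + n_minus

r = min(n_plus, n_minus)                # two-sided test uses the smaller count
mu, var = n / 2, n / 4                  # mean and variance of R under H0
z = (r + 0.5 - mu) / sqrt(var)          # continuity-corrected z statistic

print(n_plus, n_minus, round(z, 2))     # 7 12 -0.92
```

Since |z| = 0.92 < 1.96, the code reaches the same decision: fail to reject H0.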
EXAMPLE 2
The following are measurements of the breaking strength of a certain kind of 2-inch cotton ribbon, in pounds:
163 165 160 189 161 171 158 151 169 162
163 139 172 165 148 166 172 163 187 173
Use the sign test to test the null hypothesis M = 160 against the alternative hypothesis M > 160 at the
5% level of significance.
SOLUTION:
Step 1: Hypotheses
H0: M = 160
H1: M > 160
Step 2: Critical Region
α = 0.05 → one-sided upper tail: reject H0 if Z > Zα = Z0.05 = 1.645.
Step 3: Test Statistic
163 165 160 189 161 171 158 151 169 162
163 139 172 165 148 166 172 163 187 173
Hypothesized median M0 = 160
n+ = 15
n− = 4
Solving for r,
r = n+ = 15
n = n+ + n− = 19
R ~ N(n/2, n/4) = N(9.5, 4.75)
P(R ≥ r) = P(R > r − 1/2) = P(R > 15 − 0.5) = P(R > 14.5)
z = (r − 0.5 − μ)/σ = (15 − 0.5 − 9.5)/√4.75 = 2.29
Step 4: Decision
Reject H0 if Z > Zα = 1.645
2.29 > 1.645
∴ Reject H0
Step 5: Conclusion
At the 5% level of significance, there is enough evidence to conclude that the median breaking strength exceeds 160 pounds.
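The upper-tail version differs only in the choice of r and the sign of the continuity correction. A sketch in Python, using the breaking-strength data with the second value of the bottom row read as 139, which reproduces the counts n+ = 15 and n− = 4 used in the solution:

```python
from math import sqrt

# Upper-tail sign test for H0: M = 160 vs H1: M > 160.
strengths = [163, 165, 160, 189, 161, 171, 158, 151, 169, 162,
             163, 139, 172, 165, 148, 166, 172, 163, 187, 173]
m0 = 160

n_plus = sum(x > m0 for x in strengths)
n_minus = sum(x < m0 for x in strengths)
n = n_plus + n_minus                   # the one tie (160) is ignored, so n = 19

r = n_plus                             # upper tail uses the positive count
z = (r - 0.5 - n / 2) / sqrt(n / 4)    # subtract 0.5 since r > n/2

print(n_plus, n_minus, round(z, 2))    # 15 4 2.29
```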
SIGN TEST FOR PAIRED SAMPLES
⮚ Suppose samples are taken for comparison from the same population. In this case we work with the differences between the paired observations and test whether the median of the differences is zero.
Step 1: Hypotheses
Null hypothesis:
H0: M_Diff = 0   OR   H0: p = 1/2
Alternative hypotheses:
H1: M_Diff ≠ 0   OR   H1: p ≠ 1/2 (two-tailed test)
H1: M_Diff > 0   OR   H1: p > 1/2 (upper one-tailed test)
H1: M_Diff < 0   OR   H1: p < 1/2 (lower one-tailed test)
Note: In this module we will only consider the upper-tailed and two-tailed hypotheses.
Step 2: Critical Region
Reject H0 if
|Z| > Zα/2 (two-sided hypothesis) OR Z > Zα (one-sided upper-tail hypothesis)
Step 3: Test Statistic
❖ Find the difference between the two samples (data sets) and label each difference with its sign accordingly.
❖ Obtain p: p = 1/2 or 0.5.
❖ Using the normal approximation to the binomial distribution, we can say that
R ~ N(np, np(1 − p)) = N(n/2, n/4), where
np = n/2 is the mean and np(1 − p) = n/4 is the variance.
❖ For the two-sided hypothesis we consider r = min(n+, n−), and for the one-sided upper tail r = n+.
P(R ≥ r) = P(R > r − 1/2) (one-tailed hypothesis; subtract 0.5 from r since r > n/2)
P(R ≤ r) = P(R < r + 1/2) (two-tailed hypothesis; add 0.5 to r since r < n/2)
z = (r ± 0.5 − μ)/σ, where μ = n/2 and σ² = n/4
Step 4: Decision
Reject H0 if
|Z| > Zα/2 (two-sided hypothesis) OR Z > Zα (one-sided upper-tail hypothesis)
EXAMPLE 1
The table below gives paired measurements for nine units under two treatments, A and B.
Pair  A     B
1     36.3  35.1
2     48.4  46.8
3     40.2  37.3
4     54.7  50.6
5     28.7  29.1
6     42.8  41.0
7     36.1  35.3
8     39.0  39.1
9     36    36
Use the sign test to determine whether the data present sufficient evidence to indicate that one of the
treatments tends to be consistently more efficient than the other; that is, P(Y_A > Y_B) ≠ 1/2. Test by using
α = 0.05.
SOLUTION
Step 1: Hypotheses
H0: M_Diff = 0   OR   H0: p = 1/2
H1: M_Diff ≠ 0   OR   H1: p ≠ 1/2
Step 2: Critical Region
α = 0.05 → two-sided: α/2 = 0.025
Since it is two-sided, reject H0 if |Z| > Zα/2 = 1.96.
Pair  A     B     Difference  Sign
1     36.3  35.1  1.2         +
2     48.4  46.8  1.6         +
3     40.2  37.3  2.9         +
4     54.7  50.6  4.1         +
5     28.7  29.1  -0.4        -
6     42.8  41.0  1.8         +
7     36.1  35.3  0.8         +
8     39.0  39.1  -0.1        -
9     36    36    0           Ignore
n+ = 6
n− = 2
n = n+ + n− = 8
Solving for r,
r = min(n+, n−) = 2
R ~ N(n/2, n/4) = N(4, 2)
P(R ≤ r) = P(R < r + 1/2) = P(R < 2 + 0.5) = P(R < 2.5)
z = (r + 0.5 − μ)/σ = (2 + 0.5 − 4)/√2 = −1.06
Step 4: Decision
Reject H0 if |Z| ≥ Zα/2 = 1.96
|−1.06| = 1.06
1.06 < 1.96
∴ Fail to reject H0
Step 5: Conclusion
At the 5% level of significance, there is insufficient evidence to indicate that one treatment tends to be consistently more efficient than the other.
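The paired calculation can be sketched the same way in Python (variable names are our own):

```python
from math import sqrt

# Paired (two-sided) sign test on the treatment data above.
a = [36.3, 48.4, 40.2, 54.7, 28.7, 42.8, 36.1, 39.0, 36.0]
b = [35.1, 46.8, 37.3, 50.6, 29.1, 41.0, 35.3, 39.1, 36.0]

diffs = [x - y for x, y in zip(a, b)]
n_plus = sum(d > 0 for d in diffs)
n_minus = sum(d < 0 for d in diffs)    # zero differences are ignored
n = n_plus + n_minus

r = min(n_plus, n_minus)
z = (r + 0.5 - n / 2) / sqrt(n / 4)

print(n_plus, n_minus, round(z, 2))    # 6 2 -1.06
```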
Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov test is used to test the null hypothesis that a set of data comes from a
normal distribution. It produces a test statistic that is used (along with a critical value or p-value) to
decide whether the data are consistent with the hypothesized distribution.
Null Hypothesis - A null hypothesis is a statement in which there is no relationship between two variables.
The two-sample Kolmogorov-Smirnov test is a nonparametric test that compares the cumulative
distributions of two data sets. The KS test reports the maximum difference between the two
cumulative distributions, and calculates a p-value from that difference and the sample sizes.
SPSS Kolmogorov-Smirnov Test for Normality
The Kolmogorov-Smirnov test examines if scores are likely to follow some distribution in some
population. In theory, "Kolmogorov-Smirnov test" could refer to either test (but usually refers to the one-
sample Kolmogorov-Smirnov test), so the ambiguous name is better avoided; both Kolmogorov-
Smirnov tests are available in SPSS.
So say I have a population of 1,000,000 people. I think their reaction times on some task
are perfectly normally distributed. I sample 233 of these people and measure their reaction times.
Now the observed frequency distribution of these will probably differ a bit - but not too
much - from a normal distribution. So I run a histogram over the observed reaction times and
superimpose a normal distribution with the same mean and standard deviation.
[Figure: histogram of observed reaction times with a superimposed normal curve]
The frequency distribution of my scores doesn't entirely overlap with my normal curve.
Now, I could calculate the percentage of cases that deviate from the normal curve - the
percentage of red areas in the chart. This percentage is a test statistic: it expresses in a single
number how much my data differ from my null hypothesis, i.e., to what extent the observed
distribution deviates from a normal one.
Now, if my null hypothesis is true, then this deviation percentage should probably be quite
small. That is, a small deviation has a high probability value or p-value.
Conversely, a huge deviation percentage is very unlikely and suggests that my reaction times don't
follow a normal distribution in the entire population. So a large deviation has a low p-value. As
a rule of thumb, we reject the null hypothesis if p < 0.05. So if p < 0.05, we don't believe that
the data follow a normal distribution in the population.
That is the easiest way to understand how the Kolmogorov-Smirnov normality test
works. Computationally, however, it works differently: it compares the observed cumulative
distribution with the expected (normal) cumulative distribution and uses the largest absolute
difference between these curves as its test statistic, denoted by D. In this chart, the maximal
absolute difference D is (0.48 - 0.41 =) 0.07 and it occurs at a reaction time of 960 milliseconds.
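The D statistic described above can be computed directly. The sketch below is our own helper function (not SPSS's implementation); it compares the empirical CDF against a normal CDF fitted with the sample's own mean and standard deviation:

```python
import math

def ks_statistic(data):
    """Largest absolute difference D between the empirical CDF and the
    CDF of a normal distribution with the sample's mean and sd."""
    n = len(data)
    mu = sum(data) / n
    sd = math.sqrt(sum((v - mu) ** 2 for v in data) / n)
    d = 0.0
    for i, v in enumerate(sorted(data)):
        # normal CDF evaluated via the error function
        cdf = 0.5 * (1 + math.erf((v - mu) / (sd * math.sqrt(2))))
        # the empirical CDF jumps from i/n to (i+1)/n at this observation
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d

print(round(ks_statistic([1, 2, 3, 4, 5]), 2))
```

Note that when the mean and standard deviation are estimated from the same sample, the standard KS critical values are only approximate (the Lilliefors correction addresses this).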
Jarque-Bera Test
In statistics, the Jarque-Bera test is a goodness-of-fit test: it is used to test whether sample data
have the skewness and kurtosis of a normal distribution.
The test is named after Carlos Jarque, a Mexican economist, and Anil K. Bera, a professor of
economics. They derived it while working on their Ph.D. theses at the Australian National
University.
The Jarque-Bera test is a test for normality. Normality is one of the assumptions for many
statistical tests, like the t test or F test; the Jarque-Bera test is usually run before one of these tests
to confirm normality. It is usually used for large data sets, because other normality tests are not
reliable when n is large. The only requirement on the data is that they form a single sample (a vector of observations).
A normal distribution has a skew of zero (i.e. it's perfectly symmetrical around the mean) and a
kurtosis of three; kurtosis tells you how much data is in the tails and gives you an idea about how
"peaked" the distribution is. It's not necessary to know the mean or the standard deviation of the
data in order to run the test. The test statistic is always positive, and if it is not close to zero, it
signals that the sample data do not have a normal distribution.
JB = n [ S² / 6 + (K − 3)² / 24 ]
where:
n = sample size
S = skewness = [ (1/n) Σ (xᵢ − x̄)³ ] / σ̂³
K = kurtosis = [ (1/n) Σ (xᵢ − x̄)⁴ ] / σ̂⁴
and σ̂² = (1/n) Σ (xᵢ − x̄)² is the variance of the sample.
If JB > χ²(α, 2), then the null hypothesis is rejected, meaning the data are not normally distributed.
The null hypothesis for the test is that the data are normally distributed; the alternate hypothesis is
that they are not.
● Positive skewness means that the distribution has a long right tail; it is skewed to the right.
● Negative skewness means that the distribution has a long left tail; it is skewed to the left.
For example, stock returns are known to be leptokurtic, i.e. more "peaked" and fat-tailed than the
normal distribution.
The Jarque-Bera test uses these two statistical properties of the normal distribution, namely a
skewness of zero and a kurtosis of three, and compares the Jarque-Bera test statistic with the
critical value from the chi-square table with two degrees of freedom.
Example:
Suppose that for a sample, (1/n) Σ (xᵢ − x̄)² = 37.8, (1/n) Σ (xᵢ − x̄)³ = −51.4, and
(1/n) Σ (xᵢ − x̄)⁴ = 4,203.8. Then
skewness = −51.4 / (37.8)^(3/2) = −0.2212
kurtosis = 4,203.8 / (37.8)² = 2.9421
JB = n [ S²/6 + (K − 3)²/24 ] = n [ (−0.2212)²/6 + (2.9421 − 3)²/24 ]
The critical value is χ²(0.05, 2) = 5.99. If the computed JB exceeds 5.99, the null hypothesis is rejected at the
5% significance level; in other words, the data do not come from a normal distribution.
Unfortunately, most statistical software does not support this test directly. In order to interpret results,
you may need to do a little comparison (and so you should be intimately familiar with hypothesis
testing). Checking p-values is always a good idea. For example, a tiny p-value and a large chi-
square value from this test mean that you can reject the null hypothesis that the data are normally
distributed.
Under the null hypothesis, the JB statistic asymptotically follows a chi-square
distribution with two degrees of freedom, so the statistic can be used to test the hypothesis that
the data are from a normal distribution. The null hypothesis is a joint hypothesis of the skewness
being zero and the excess kurtosis being zero. Samples from a normal distribution have an
expected skewness of 0 and an expected excess kurtosis of 0 (which is the same as a kurtosis of
3). As the definition of JB shows, any deviation from this increases the JB statistic.
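Since the chi-square survival function with 2 degrees of freedom is simply exp(−x/2), the critical value χ²(0.05, 2) = 5.99 quoted in the example above can be checked in one line:

```python
import math

# P(chi-square with 2 df > x) = exp(-x/2), so the 5% critical
# value solves exp(-x/2) = 0.05, i.e. x = -2 * ln(0.05).
crit = -2 * math.log(0.05)
print(round(crit, 2))   # 5.99
```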
Step 1: Input the data. First, input the dataset into one column.
Step 2: Compute the skewness, kurtosis, and test statistic:
Skewness: =SKEW(A2:A16)
Kurtosis: =KURT(A2:A16)
JB test statistic: =(C2/6)*(C3^2+(C4^2/4))
Here C2 holds n, C3 the skewness, and C4 the kurtosis. Since Excel's KURT returns excess kurtosis, this formula is JB = (n/6)[S² + K²/4], equivalent to the definition above.
Score data (A2:A16): 10, 8, 9, 5, 7, 6, 6, 9, 6, 6, 8, 5, 7, 8, 4
n (sample size): 15
S (skewness): 0.118547
C (kurtosis): -0.76348
JB (test statistic): 0.3994
Recall that under the null hypothesis of normality, the test statistic JB follows a Chi-Square
distribution with 2 degrees of freedom. Thus, to find the p-value for the test we use the
following function in Excel: =CHISQ.DIST.RT(C5, 2), where C5 holds the JB statistic.
The p-value of the test is 0.8190. Since this p-value is greater than 0.05, we fail to reject
the null hypothesis: we do not have enough evidence to say that the dataset is not normally distributed.
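The whole calculation can also be sketched in Python using the population moment estimators from the formulas earlier. Note that Excel's SKEW and KURT apply small-sample adjustments, so their values (and the resulting JB) differ slightly from these for small n:

```python
import math

def jarque_bera(data):
    """JB = n * (S^2/6 + (K - 3)^2/24), with skewness S and kurtosis K
    computed from population (1/n) moment estimators."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    s = sum((x - mean) ** 3 for x in data) / n / var ** 1.5   # skewness
    k = sum((x - mean) ** 4 for x in data) / n / var ** 2     # kurtosis
    jb = n * (s ** 2 / 6 + (k - 3) ** 2 / 24)
    p = math.exp(-jb / 2)   # chi-square(2) survival function
    return jb, p

scores = [10, 8, 9, 5, 7, 6, 6, 9, 6, 6, 8, 5, 7, 8, 4]
jb, p = jarque_bera(scores)
print(round(jb, 3), round(p, 3))
```

Either way the p-value is far above 0.05, so we fail to reject normality for these scores.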
References:
https://www.spss-tutorials.com/spss-kolmogorov-smirnov-test-for-normality
https://www.graphpad.com/guides/prism/latest/statistics/interpreting_results_kolmogorov-
smirnov_test.htm
https://keydifferences.com/difference-between-null-and-alternative-hypothesis.html
https://collinsdwight.medium.com/jarque-bera-test-of-normality-a108a1515b22
https://www.statisticshowto.com/jarque-bera-test
https://digensia.wordpress.com/2012/05/07/the-jarque-bera-test-for-normality-testing
https://www.r-bloggers.com
https://www.statology.org/jarque-bera-test-excel
https://www.youtube.com/watch?v=ZbjsXS8oKfo
https://onlinepubs.trb.org/onlinepubs/nchrp/cd-22/manual/v2appendixc.pdf
https://www.sagepub.com/sites/default/files/upm-binaries/40007_Chapter8.pdf
https://www.youtube.com/watch?v=ztmua4TrLLM