BFC 34303 Civil Engineering Statistics: Chi-Square Test and Non-Parametric Tests
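If $Z_1, Z_2, \ldots, Z_k$ are independent standard normal variables, then
$$Z_1^2 + Z_2^2 + \cdots + Z_k^2$$
follows the chi-square distribution with $k$ degrees of freedom, whose
probability density function is
$$f(x) = \frac{x^{k/2 - 1}\, e^{-x/2}}{2^{k/2}\, \Gamma(k/2)} \quad \text{for } x \ge 0$$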
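As a rough illustration, the chi-square density can be evaluated directly with Python's standard library. The helper name `chi2_pdf` is hypothetical and is not part of the course material; the sketch only transcribes the density formula for $k$ degrees of freedom.

```python
import math

def chi2_pdf(x: float, k: int) -> float:
    """Chi-square density for k degrees of freedom (hypothetical helper)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if x < 0:
        return 0.0
    if x == 0:
        # Limit of the density at zero: unbounded for k < 2, 0.5 for k = 2, 0 otherwise
        if k < 2:
            return math.inf
        return 0.5 if k == 2 else 0.0
    half_k = k / 2
    return x ** (half_k - 1) * math.exp(-x / 2) / (2 ** half_k * math.gamma(half_k))

# For k = 2 the density reduces to 0.5 * exp(-x / 2)
value_k2 = chi2_pdf(2.0, 2)

# Crude trapezoidal check that the density for k = 4 integrates to about 1
step = 0.01
grid = [i * step for i in range(6001)]  # 0 to 60
area_k4 = sum(
    (chi2_pdf(a, 4) + chi2_pdf(b, 4)) * step / 2 for a, b in zip(grid, grid[1:])
)
print(value_k2, area_k4)
```

The density is never negative and is positively skewed, which matches the characteristics of the chi-square distribution.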
Chi-square Statistic
The $\chi^2$ statistic for a random sample of size $n$ with a standard deviation $s$,
selected from a normal population having a standard deviation $\sigma$, can be
calculated using the following equation:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
Example 9.1
A manufacturer has developed a new cell phone battery. On average, the
battery lasts 60 minutes on a single charge. The standard deviation is 4
minutes. Suppose the manufacturing department runs a quality control
test. They randomly select 7 batteries. The standard deviation of the
selected batteries is 6 minutes. What is the probability that the standard
deviation will be greater than 6 minutes?
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(7-1)(6)^2}{4^2} = 13.5$$
Degrees of freedom = $n - 1 = 7 - 1 = 6$
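The probability asked for in Example 9.1 is the area to the right of 13.5 under the chi-square curve with 6 degrees of freedom, which is normally read from the cumulative table. The sketch below is one possible check rather than the method used in the slides: it relies on the standard closed form for the upper tail when the degrees of freedom are even, and the function name `chi2_sf_even_df` is hypothetical.

```python
import math

def chi2_sf_even_df(x: float, df: int) -> float:
    """P(chi-square > x) for a positive, even number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    # exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

n, s, sigma = 7, 6.0, 4.0                    # values from Example 9.1
chi2_stat = (n - 1) * s ** 2 / sigma ** 2    # 13.5
p_upper = chi2_sf_even_df(chi2_stat, n - 1)  # upper-tail probability with df = 6
print(chi2_stat, round(p_upper, 3))
```

Under these assumptions the upper-tail probability comes out at roughly 0.036.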
[Table: Cumulative $\chi^2$ distribution]
Goodness-of-Fit Test
The goodness-of-fit test is one of the most commonly used non-
parametric tests, which was introduced by Karl Pearson.
The purpose of this test is to compare an observed set of frequencies to
an expected set of frequencies.
The chi-square distribution, which is used as the test statistic, has the
following characteristics:
• Chi-square is never negative.
• There is a family of chi-square distributions.
• The chi-square distribution is positively skewed.
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \quad \text{with } k - 1 \text{ degrees of freedom}$$
where
$k$ = number of categories
$f_o$ = observed frequency in a particular category
$f_e$ = expected frequency in a particular category
Example 9.2
From the chi-square table with $\alpha = 0.05$ and df = 5, the critical $\chi^2$ value is
11.07.
[Table: Critical $\chi^2$ values]
Decision rule:
If the calculated $\chi^2$ is greater than or equal to the critical $\chi^2$ (11.07),
reject $H_0$.
Since there are 120 samples, we expect that each technician will test 20
samples.
Technician   f_o    f_e    (f_o - f_e)^2 / f_e
Tom          13     20     2.45
Ryan         33     20     8.45
Tyra         14     20     1.80
George       7      20     8.45
Hannah       36     20     12.80
John         17     20     0.45
Total        120    120    34.40

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 34.40$$
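The goodness-of-fit calculation for the six technicians can be reproduced in a few lines. This is a minimal sketch using the observed counts from the table and the critical value of 11.07 quoted earlier; the variable names are illustrative only.

```python
# Observed number of samples tested by each technician
observed = {"Tom": 13, "Ryan": 33, "Tyra": 14, "George": 7, "Hannah": 36, "John": 17}

total = sum(observed.values())    # 120 samples
expected = total / len(observed)  # equal workload expected: 20 each

# Chi-square statistic: sum of (f_o - f_e)^2 / f_e over all categories
chi2_stat = sum((f_o - expected) ** 2 / expected for f_o in observed.values())

df = len(observed) - 1            # k - 1 = 5
critical = 11.07                  # from the chi-square table, alpha = 0.05, df = 5
reject_h0 = chi2_stat >= critical

print(round(chi2_stat, 2), df, reject_h0)
```

Because the calculated value is far above the critical value, the decision rule leads to rejecting $H_0$.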
Contingency Table Analysis
The chi-square test can also be used for research involving two traits.
For example:
1. Is there any relationship between the grade point average (GPA) of
students and their income 10 years after graduation?
2. Is there an association between drivers of different vehicle classes and
their compliance with speed limits?
A contingency table analysis is conducted to find the relationship between
two variables.
Information is collected and displayed in a contingency table, which is a
type of table in a matrix format that shows the frequency distribution of the
variables.
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \quad \text{with } (r-1)(c-1) \text{ degrees of freedom}$$
where
$r$ = number of rows
$c$ = number of columns
$f_o$ = observed frequency for a cell
$f_e$ = expected frequency for a cell = $\dfrac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$
Example 9.3
From the chi-square table with $\alpha = 0.05$ and df = 3, the critical $\chi^2$ value is
7.815.
[Table: Critical $\chi^2$ values]
Decision rule:
If the calculated $\chi^2$ is greater than or equal to the critical $\chi^2$ (7.815),
reject $H_0$.
Safety helmet usage by age group of motorcyclists

Helmet usage    16-25        26-40        41-55        > 55         Total
                f_o   f_e    f_o   f_e    f_o   f_e    f_o   f_e
Used            25    30     33    36     27    24     35    30     120
Did not use     25    20     27    24     13    16     15    20     80
Total           50           60           40           50           200

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 5.729$$
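The expected frequencies and the $\chi^2$ value for the helmet-usage table can be checked with a short script. This is only a sketch of the contingency-table arithmetic, using the observed counts from the table and the critical value 7.815 for df = 3; the variable names are illustrative.

```python
# Observed frequencies: rows = helmet usage, columns = age groups
observed = [
    [25, 33, 27, 35],  # Used
    [25, 27, 13, 15],  # Did not use
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for a cell = row total x column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2_stat = sum(
    (f_o - f_e) ** 2 / f_e
    for obs_row, exp_row in zip(observed, expected)
    for f_o, f_e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
critical = 7.815                                   # alpha = 0.05, df = 3
reject_h0 = chi2_stat >= critical

print(expected, round(chi2_stat, 3), df, reject_h0)
```

Here the calculated value is below the critical value, so the decision rule does not lead to rejecting $H_0$.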
Reasons for using parametric and non-parametric tests
Difference between Parametric and Non-Parametric Tests (cont’d)
Parametric tests:
• Cannot be used to test for nominal data.
• Are powerful, so we are more likely to detect a significant effect if it exists.
Non-parametric tests:
• Can be used to test for nominal and ordinal data.
• Are less powerful than parametric tests.
Example 9.4
15 engineers were randomly selected to assess their level of competence
in using software for design and analysis. The engineers underwent a
software training program and were rated Outstanding, Excellent, Good,
Fair or Poor before and after the training.
Can it be concluded that the engineers were more competent after
the training?

Engineer   Before      After
Terry      Good        Outstanding
Sue        Fair        Excellent
James      Excellent   Good
Ted        Poor        Good
Andy       Excellent   Excellent
Sarah      Good        Outstanding
John       Poor        Fair
Jim        Excellent   Outstanding
Cody       Good        Poor
Troy       Poor        Good
Vanessa    Good        Outstanding
Cole       Fair        Excellent
Candy      Good        Fair
Arthur     Good        Outstanding
Sandra     Poor        Good
$H_0: \pi \le 0.50$ (There is no increase in competence as a result of the training)
$H_1: \pi > 0.50$ (There is an increase in competence as a result of the training)
Notes:
𝜋 refers to the proportion in the population.
The binomial distribution is used as the test statistic because the sign test meets
the binomial assumptions:
1. There are only two outcomes: “success” and “failure”.
2. Each trial is independent (the performance of one engineer is not related to
another engineer).
3. For each trial the probability of success is assumed to be 𝑝 = 0.50.
4. The total number of trials is fixed, i.e. $n = 14$ (Andy's rating did not change,
so that pair is excluded from the 15 engineers).
Number of successes   Probability of success   Cumulative probability
0                     0.000                    1.000
1                     0.001                    0.999
2                     0.006                    0.998
3                     0.022                    0.992
4                     0.061                    0.970
5                     0.122                    0.909
6                     0.183                    0.787
7                     0.209                    0.604
8                     0.183                    0.395
9                     0.122                    0.212
10                    0.061                    0.090
11                    0.022                    0.029
12                    0.006                    0.007
13                    0.001                    0.001
14                    0.000                    0.000

The probabilities of success are taken from the binomial probabilities table.
The cumulative probabilities are obtained by adding up the probabilities from
the bottom of the table.
The critical value is 10: its cumulative probability (0.090) is the one that is
closest to, but not greater than, the level of significance $\alpha = 0.10$.
Decision:
Since the number of plus signs (11) is greater than the critical value (10),
we reject $H_0$. Thus we accept $H_1$, which states that there is an increase in
competence as a result of the training.
In other words, the training was effective, as there is evidence of an
increase in the level of competence after the training.
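The sign test for Example 9.4 can be sketched with the standard library. The numeric coding of the rating scale and the helper `upper_tail` are assumptions made for illustration; the binomial probabilities use $n = 14$ and $p = 0.50$ as stated in the notes.

```python
from math import comb

# Ratings coded from lowest to highest (an illustrative coding of the scale)
scale = {"Poor": 0, "Fair": 1, "Good": 2, "Excellent": 3, "Outstanding": 4}

# (before, after) ratings for the 15 engineers in Example 9.4
ratings = [
    ("Good", "Outstanding"), ("Fair", "Excellent"), ("Excellent", "Good"),
    ("Poor", "Good"), ("Excellent", "Excellent"), ("Good", "Outstanding"),
    ("Poor", "Fair"), ("Excellent", "Outstanding"), ("Good", "Poor"),
    ("Poor", "Good"), ("Good", "Outstanding"), ("Fair", "Excellent"),
    ("Good", "Fair"), ("Good", "Outstanding"), ("Poor", "Good"),
]

plus = sum(scale[after] > scale[before] for before, after in ratings)
minus = sum(scale[after] < scale[before] for before, after in ratings)
n = plus + minus  # pairs with no change are dropped

def upper_tail(x: int, trials: int, p: float = 0.5) -> float:
    """P(X >= x) for a binomial variable with the given number of trials."""
    return sum(
        comb(trials, i) * p ** i * (1 - p) ** (trials - i)
        for i in range(x, trials + 1)
    )

alpha = 0.10
# Smallest number of plus signs whose upper-tail probability does not exceed alpha
critical = next(x for x in range(n + 1) if upper_tail(x, n) <= alpha)
p_value = upper_tail(plus, n)

print(plus, minus, n, critical, round(p_value, 3))
```

With these inputs the script finds 11 plus signs out of 14 usable pairs and a critical value of 10, in line with the decision above.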
Wilcoxon Signed-Rank Test
The Wilcoxon signed-rank test is a non-parametric test developed by
Frank Wilcoxon. It is based on the differences in dependent (matched)
samples, where the normality assumption is not required.
This test is the non-parametric alternative to the dependent samples $t$-test.
Two slightly different versions of the test exist:
• The Wilcoxon signed-rank test – compares the sample median
against a hypothetical median.
• The Wilcoxon matched-pairs signed-rank test – computes the
difference between each set of matched pairs, then follows the same
procedure as the signed-rank test to compare the sample against some
median.
The null hypothesis for this test is that the medians of two samples are
equal. It is generally used:
• as a non-parametric alternative to the one-sample 𝑡-test or paired 𝑡-
test.
• for ordered (ranked) categorical variables without a numerical scale.
Example 9.5
Vehicle speeds along 10 roads were recorded during the day and at night
in order to study if speeds differ during the day and at night. The
average speeds obtained are shown in the table.
At the 0.05 significance level, is there evidence to conclude that
vehicle speeds are different during the day and at night?

Road   Speed during the day (km/h)   Speed at night (km/h)
1      67.2                          65.3
2      59.4                          54.7
3      80.1                          81.3
4      47.6                          39.8
5      97.8                          92.5
6      57.3                          52.4
7      75.2                          79.8
8      94.7                          89.0
9      64.3                          58.4
10     54.0                          56.4
Decision rule:
If the calculated Wilcoxon test statistic, $W$, is less than or equal to the
critical Wilcoxon value, reject $H_0$.
       Speed during     Speed at                   Absolute     Rank (ascending   Signed rank
Road   the day (km/h)   night (km/h)   Difference  difference   order)            R+     R-
1      67.2             65.3           1.9         1.9          2                 2
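The remaining rows of the ranking table follow the same steps: take the day-minus-night difference, rank the absolute differences in ascending order, then split the ranks by sign. The sketch below applies those steps to the ten roads of Example 9.5. The helper `average_ranks` is hypothetical, and taking the smaller of the two rank sums as $W$ is an assumption based on the usual form of the test and on the "less than or equal to" decision rule.

```python
day = [67.2, 59.4, 80.1, 47.6, 97.8, 57.3, 75.2, 94.7, 64.3, 54.0]
night = [65.3, 54.7, 81.3, 39.8, 92.5, 52.4, 79.8, 89.0, 58.4, 56.4]

# Differences (day minus night), rounded to one decimal to avoid floating-point noise
diffs = [round(d - nt, 1) for d, nt in zip(day, night)]
diffs = [d for d in diffs if d != 0]  # zero differences would be dropped

def average_ranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

ranks = average_ranks([abs(d) for d in diffs])
r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)   # sum of ranks, positive differences
r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)  # sum of ranks, negative differences
w_stat = min(r_plus, r_minus)

print(diffs, ranks, r_plus, r_minus, w_stat)
```

The resulting $W$ is then compared with the critical value from the Wilcoxon signed-rank table for the chosen significance level.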
[Table: Critical values for the Wilcoxon signed-rank test]
Mann-Whitney Test
The Mann-Whitney Test, also known as the Wilcoxon Rank Sum Test, is
specifically designed to determine whether two independent samples
came from equivalent populations.
This test is an alternative to the two-sample $t$ test we have learned
previously. Unlike the $t$ test, the Mann-Whitney Test does not require the
two populations to follow the normal distribution and have equal
population variances.
The Mann-Whitney Test is based on the average of ranks. The data are
ranked as if the observations were part of a single sample.
If the null hypothesis is true, then the ranks will be about evenly
distributed between the two samples, and the average of the ranks for the
two samples will be about the same.
If the alternative hypothesis is true, one of the samples will have more of
the lower ranks, and thus, a smaller rank average.
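If each of the samples contains 8 or more observations, the standard
normal distribution is used as a test statistic. The formula is:
$$z = \frac{W - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$
where $n_1$ and $n_2$ are the numbers of observations in the first and second
samples, and $W$ is the sum of the ranks of the first sample.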
$H_0$: The distribution of no-shows is the same for Penang and Kuala
Lumpur
$H_1$: The distribution of no-shows is larger for Penang than for Kuala
Lumpur
Decision rule:
If the calculated $z$ value is greater than or equal to the critical $z$ value,
reject $H_0$.
[Figure: standard normal curve with an area of 0.45 between 0 and the critical $z$
value and 0.05 (5%) in the tail]
Areas under the Standard Normal Curve (z-Table) showing values for P(0 ≤ Z ≤ z)
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
Sample 1 ($n_1 = 9$)     Sample 2 ($n_2 = 8$)
No-shows   Rank          No-shows   Rank
10         3.5           10         3.5
24         16            17         11
22         15            21         14
25         17
…
Sum        96.5          Sum        56.5

Example: The value of 10 no-shows occurs twice. The ranks involved are 3 and 4,
but we assign the average rank, (3 + 4)/2 = 3.5.
We then calculate $z$, knowing that $W = 96.5$, $n_1 = 9$ and $n_2 = 8$.
$$z = \frac{W - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} = \frac{96.5 - \dfrac{9(9 + 8 + 1)}{2}}{\sqrt{\dfrac{9(8)(9 + 8 + 1)}{12}}} = 1.49$$
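The $z$ calculation can be verified numerically. In this sketch the function name `rank_sum_z` is hypothetical, and the critical value 1.645 is an assumption: it is the $z$ value for an area of 0.45 in the z-table, i.e. a one-tailed test at the 0.05 level.

```python
import math

def rank_sum_z(w: float, n1: int, n2: int) -> float:
    """Normal approximation for the rank sum W of the first sample."""
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean_w) / sd_w

z = rank_sum_z(96.5, 9, 8)

critical_z = 1.645              # assumed: area 0.45 in the z-table, one-tailed 5%
reject_h0 = z >= critical_z     # upper-tailed, since H1 states "larger"

print(round(z, 2), reject_h0)
```

Under these assumptions the calculated $z$ falls below the critical value, so $H_0$ would not be rejected.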
Mann-Whitney 𝑈 Test
The Mann-Whitney 𝑈 Test is the non-parametric alternative test to the
independent sample 𝑡 test.
It is used to test whether two independent samples come from the same
population, that is, whether the two sample means are equal or not.
Usually, the Mann-Whitney 𝑈 Test is used when the data is ordinal or
when the assumptions of the 𝑡 test are not met.
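The Mann-Whitney $U$ statistic is determined using the following formula:
for Sample 1: $$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$$
for Sample 2: $$U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2$$
where $n_1$ and $n_2$ are the sample sizes, and $R_1$ and $R_2$ are the sums of the
ranks of Sample 1 and Sample 2.
Select the smaller of the two as the calculated $U$ value.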
$H_0$: The sample means are equal (samples are taken from identical populations)
$H_1$: The sample means are not equal (samples are not taken from identical
populations)
Decision rule:
If the calculated $U$ value is less than or equal to the critical $U$ value,
reject $H_0$.
Next, we assign ranks and find the sum of the ranks.
(This step is similar to the previous example.)

Male               Female
Score   Rank       Score   Rank
10      1          36      3
22      2          53      5
42      4          54      6
59      8          56      7
61      9          63      10.5
63      10.5       84      14
65      12         88      16
83      13
85      15
90      17
93      18
Sum     109.5      Sum     61.5
Since the calculated $U$ (33.5) is greater than the critical $U$ (16), we do
not reject $H_0$, which states that the sample means are equal.
We can therefore conclude that there is no significant difference between the
scores for males and females.
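The ranking and the $U$ calculation for the male and female scores can be reproduced as follows. This is a sketch under the document's formulas; the helper `average_ranks` is hypothetical, and the critical value 16 is taken from the conclusion above.

```python
male = [10, 22, 42, 59, 61, 63, 65, 83, 85, 90, 93]
female = [36, 53, 54, 56, 63, 84, 88]

def average_ranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

# Rank all scores as if they were part of a single sample
ranks = average_ranks(male + female)
n1, n2 = len(male), len(female)
r1 = sum(ranks[:n1])  # sum of ranks, Sample 1 (male)
r2 = sum(ranks[n1:])  # sum of ranks, Sample 2 (female)

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
u_stat = min(u1, u2)  # select the smaller of the two

critical_u = 16       # critical value quoted in the conclusion
reject_h0 = u_stat <= critical_u

print(r1, r2, u1, u2, u_stat, reject_h0)
```

The two tied scores of 63 share the average rank 10.5, which is why the rank sums are not whole numbers.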