
5.3 Hypothesis Testing
Chi-square test
Dr. Jyotika Doshi
χ2 Applications
(Pronounced “kie-square”)
• Parametric test (for Quantitative data)
– Test for variance in a normal population (1-sample variance test): $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
• Non-parametric tests (for Qualitative data)
– Compare observed sample frequencies with the frequencies expected when the null hypothesis H0 is assumed true
– Test statistic: $\chi^2 = \sum \frac{(O-E)^2}{E}$, O: observed frequency, E: expected frequency under null hypothesis

7/12/2021 Dr. Jyotika Doshi 3


Qualitative data classification
• Categorical: Nominal or Ordinal
• Subjects classified into one or more categories (groups)
• Gender as ‘male’ or ‘female’
• Patient's condition as 'poor', 'fair', 'good' or 'excellent'
• Options for a question 'yes', 'no', or 'don't know'
• Categorical variables with only two categories (such as
'female' or 'male' ): called dichotomous or binary
• Binomial and multinomial distributions



Applications: Non-parametric χ2 test
• Test for Independence or Association of two attributes (for categorical data)
• Test for goodness-of-fit
• Test for Homogeneity - Equality of Population
Proportions (2 or more populations)



Test statistic
• $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
– fo: observed frequency
– fe: expected frequency under null hypothesis

• fe using probability
– Probability p = x/n ⇒ frequency fe = x = np
– x: number of favorable outcomes
– n: total outcomes
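
A minimal sketch of the statistic and of fe = np. The function name `chi_square_stat` and the coin-toss counts are hypothetical, chosen only for illustration.

```python
def chi_square_stat(observed, expected):
    """Return sum((fo - fe)^2 / fe) over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Hypothetical data: 60 tosses of a coin assumed fair under H0.
n = 60
probs = [0.5, 0.5]                  # p for heads, tails under H0
expected = [n * p for p in probs]   # fe = n * p = 30 for each outcome
observed = [36, 24]                 # fo (made-up counts)

stat = chi_square_stat(observed, expected)
print(round(stat, 2))               # 2.4
```

Each category contributes (fo - fe)^2 / fe, so the two cells here contribute 1.2 each.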



Necessary conditions
for the Chi-square based tests
• In general N > 50, N=total observations
• No single accepted cutoff, but general rules are
– No cells with observed frequency = 0
– No cells with the expected frequency < 5
• When a category has expected frequency < 5,
it is often appropriate to combine adjacent
categories to have an expected frequency of
five or more in each category
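
One possible way to combine adjacent categories is sketched below. The helper `pool_adjacent` and the sample counts are hypothetical. The rule used (merge left to right until the running expected count reaches 5, then fold any small leftover into the last class) is only one reasonable choice.

```python
def pool_adjacent(observed, expected, min_expected=5.0):
    """Merge neighbouring categories until every expected count >= min_expected."""
    obs, exp = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= min_expected:
            obs.append(acc_o)
            exp.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0:                 # leftover tail is still below the cutoff
        if exp:
            obs[-1] += acc_o      # fold it into the last pooled category
            exp[-1] += acc_e
        else:
            obs.append(acc_o)
            exp.append(acc_e)
    return obs, exp

# Hypothetical counts: the last two categories have expected frequency < 5.
obs, exp = pool_adjacent([12, 9, 3, 1], [10.0, 8.0, 4.0, 3.0])
print(obs, exp)   # [12.0, 9.0, 4.0] [10.0, 8.0, 7.0]
```

Pooling reduces the number of categories, so the degrees of freedom must be counted after pooling.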



Contingency table
• Also called cross classification table
• Cross-tabulation of two categorical variables
• Contingency table: list all possible contingencies
– List all possible combinations of two variables
• Categories: mutually exclusive and collectively exhaustive
• Cell value in a contingency table: frequency
• Table referred using size (r x c)
– Number of rows: r
– Number of columns: c
• Marginal totals: row totals, column totals
• Grand total N: total number of observations
• Degree of freedom given marginal totals: (r-1)(c-1)
– [assumption: ∑O = ∑E, in marginal totals]
Contingency table
• Acceptance of HIV test grouped by marital status
                               Acceptance of HIV test
Marital status                 Accepted   Rejected   Total
Married                           71        415       486
Living with partner               41        181       222
Single                            15         35        50
Divorced/widowed/separated         7         23        30
Total                            134        654       788



Contingency table
fij using marginal totals
• Under the null hypothesis, expected frequencies can be computed from the marginal totals: fij = (Ri*Cj)/N
• Acceptance of HIV test grouped by marital status (expected frequencies to be filled in)
                               Acceptance of HIV test
Marital status                 Accepted       Rejected   Total
Married                        486*134/788                486
Living with partner                                       222
Single                                                    50
Divorced/widowed/separated                                30
Total                          134            654         788
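
The remaining cells can be filled in the same way. The sketch below applies fij = Ri*Cj/N to the observed HIV-test table. The helper name `expected_counts` is hypothetical.

```python
# Observed counts: columns are Accepted, Rejected.
observed = [
    [71, 415],   # Married
    [41, 181],   # Living with partner
    [15, 35],    # Single
    [7, 23],     # Divorced/widowed/separated
]

def expected_counts(table):
    """Expected frequency of each cell under independence: Ri * Cj / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

The expected table keeps the same row and column totals as the observed one, which is the ∑O = ∑E condition behind the (r-1)(c-1) degrees of freedom.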
Test of independence



Test for Independence (or association)
of two attributes
• For qualitative data: chi-square test
• Independence of two variables
– Eye colour and Hair colour
– Marital status and acceptance of HIV test
– Gender and Ice Cream Flavour Preference
– Machine breakdown and Shift
• For quantitative data, use correlation analysis
– $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$, d.f. (n-2), t with same sign as r
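
For comparison, the t statistic for a correlation coefficient can be computed as below. The values r = 0.6 and n = 18 are hypothetical.

```python
import math

def correlation_t(r, n):
    """t statistic for H0: no linear correlation; d.f. = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t = correlation_t(0.6, 18)   # hypothetical sample correlation and sample size
print(round(t, 2))           # 3.0, with 16 d.f.
```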



Computing expected frequencies
under assumption of independence
• For two independent events A and B
• Consider
– Attributes of A in rows, attributes of B in columns
• P(AiBj) = P(Ai)*P(Bj) [ joint probability ]
Fij/N = (Fi/N) * (Fj/N)
where Fi: freq. of Ai = row total Ri
Fj: freq. of Bj = column total Cj
• Fij = N * (Fi/N) * (Fj/N) = Fi*Fj/N = (Ri * Cj) /N



Example 1: Test of independence
• Test if gender and the choice of ice-cream flavor
are associated or not. Following table shows the
classification of choice of 150 adults.
• Expected Frequencies for Contingency tables
under the assumption of independence
– P(AB) = P(A) P(B) ⇒ (Fij/N) = (Ri/N)(Cj/N)
– Fij = Ri*Cj/N
                Ice Cream Flavour Preference
Gender    Chocolate   Vanilla   Strawberry   Total
Men          20          40         20          80
Women        30          30         10          70
Total        50          70         30         150



Example 1: test of independence
• Step 1
– H0: gender and choice of ice-cream flavor are
independent (NOT dependent)
– H1: gender and choice of ice-cream flavor are
dependent
• Step 2: α = 0.05
• Step 3: $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$, d.f. = (r-1)(c-1)



Example 1
observed and expected frequencies
• Expected frequency of cell (i,j) Fij = Ri*Cj/N
• F11=R1*C1/N = 80*50/150 = 26.67
• F13=R1*C3/N = 80*30/150 = 16
• F23=R2*C3/N = 70*30/150 = 14
• ...

Cell values: observed frequency, (expected frequency)

                Ice Cream Flavour Preference
Gender    Chocolate      Vanilla        Strawberry   Total
Men       20, (26.67)    40, (37.33)    20, (16)      80
Women     30, (23.33)    30, (32.67)    10, (14)      70
Total     50             70             30           150



Example 1: chi-square statistic
• Step 4
                Ice Cream Flavour Preference
Gender    Chocolate     Vanilla       Strawberry   Total
Men       20 (26.67)    40 (37.33)    20 (16)       80
Women     30 (23.33)    30 (32.67)    10 (14)       70
Total     50            70            30           150
• χ2 = ∑ ((Fo-Fe)2/Fe) = 6.12
• (Fo-Fe)2/Fe for each cell:
• C11: (20-26.67)2/26.67 = 1.67
• C12: (40-37.33)2/37.33 = 0.19
• C13: (20-16)2/16 = 1.00
• C21: (30-23.33)2/23.33 = 1.90
• C22: (30-32.67)2/32.67 = 0.22
• C23: (10-14)2/14 = 1.14
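
The cell-by-cell arithmetic above can be checked with a short script. This is a sketch that uses only the observed table from the example. The variable names are illustrative.

```python
observed = [
    [20, 40, 20],   # Men: chocolate, vanilla, strawberry
    [30, 30, 10],   # Women
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of cell (i, j) under independence: Ri * Cj / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Contribution of each cell: (Fo - Fe)^2 / Fe
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
chi2 = sum(sum(row) for row in contributions)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi2, 2), df)   # 6.12 2
```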
Example 1
Step 5: rejection rule
• Chi-square table value (critical value)
• d.f. = (r-1)(c-1) = 1*2 = 2
• χ2calc = 6.12, χ2tab = 5.99
• Rejection region: upper tail of area α, to the right of χ2tab; χ2calc falls inside it
• P-value: between 0.025 and 0.05 < (α=0.05)
• Reject H0
• Step 6:
– Conclusion: at 5% significance, there is dependence between
gender and choice of ice-cream flavor
– Marketing of ice-cream should be gender based
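
For 2 degrees of freedom the upper-tail probability has a simple closed form, P(χ2 > x) = exp(-x/2), so the table lookup in Step 5 can be checked without statistical tables. The sketch below relies on that special case and does not apply to other d.f.

```python
import math

chi2_calc = 6.12    # statistic from Step 4
alpha = 0.05

# Valid for 2 d.f. only: survival function and upper-alpha critical value.
p_value = math.exp(-chi2_calc / 2)
critical = -2 * math.log(alpha)

reject_h0 = chi2_calc > critical
print(round(p_value, 4), round(critical, 2), reject_h0)
```

Both routes agree: the statistic exceeds the critical value and the p-value lies between 0.025 and 0.05.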



Example 2
test of independence
• Test at 1% significance, whether there is any
relationship between type of Malaria and
tropical region using given data.
            Region 1   Region 2   Region 3   Totals
Malaria A      31         14         45        90
Malaria B       2          5         53        60
Malaria C      53         45          2       100
Totals         86         64        100       250



Example 2 …
• Step 1
– H0: type of Malaria and tropical region are
independent (NOT dependent)
– H1: dependency between type of Malaria and
tropical region
• Step 2: α = 0.01
• Step 3: $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$, d.f. (r-1)(c-1)



Example 2…
• Step 4: χ2calc = 125.516
(i,j)   Observed O   Expected E = Ri x Cj / N   |O - E|   (O - E)2   (O - E)2 / E
(1,1)   31           30.96                      0.04      0.0016     0.0000516
(1,2)   14           23.04                      9.04      81.72      3.546
(1,3)   45           36.00                      9.00      81.00      2.25
(2,1)   2            20.64                      18.64     347.45     16.83
(2,2)   5            15.36                      10.36     107.33     6.99
(2,3)   53           24.00                      29.00     841.00     35.04
(3,1)   53           34.40                      18.60     345.96     10.06
(3,2)   45           25.60                      19.40     376.36     14.70
(3,3)   2            40.00                      38.00     1444.00    36.10
χ2calc = 125.516



Example 2…
step 4 (different representation)
• Cell value: O, (E) on the first line, (O-E)2/E on the second line
• Totals column: just to tally total O and (total E)
            Region 1       Region 2       Region 3     Totals
Malaria A   31, (30.96)    14, (23.04)    45, (36)      90
            0.0000516      3.546          2.25
Malaria B   2, (20.64)     5, (15.36)     53, (24)      60
            16.83          6.99           35.04
Malaria C   53, (34.40)    45, (25.60)    2, (40)      100
            10.06          14.70          36.10
• χ2calc = 125.516
Example 2…
• Step 5: rejection rule
• Degrees of freedom = (r-1) x (c-1) = 2(2) = 4
• p-value (far below 0.005) ≤ α (=0.01)
– Reject H0
• χ2calc = 125.516, χ2tab = 13.277
– χ2calc ≥ χ2tab, so reject H0
• Step 6: conclusion
– Dependence between type of Malaria and tropical
region
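
Example 2 can be reproduced end to end with the sketch below. The helper names are hypothetical. The p-value uses the finite-sum closed form of the chi-square upper tail, which holds for even d.f. only; the d.f. here is 4.

```python
import math

observed = [
    [31, 14, 45],   # Malaria A in regions 1..3
    [2, 5, 53],     # Malaria B
    [53, 45, 2],    # Malaria C
]

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    chi2 = sum(
        (table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(len(rows))
        for j in range(len(cols))
    )
    return chi2, (len(rows) - 1) * (len(cols) - 1)

def chi2_sf_even_df(x, df):
    """P(chi-square > x) for even d.f.: exp(-x/2) * sum_{k<df/2} (x/2)^k / k!"""
    if df % 2 != 0:
        raise ValueError("this closed form needs even d.f.")
    term, total = 1.0, 0.0
    for k in range(df // 2):
        total += term
        term *= (x / 2) / (k + 1)
    return math.exp(-x / 2) * total

chi2, df = chi_square_independence(observed)
p_value = chi2_sf_even_df(chi2, df)
print(round(chi2, 2), df, p_value < 0.01)
```

Without intermediate rounding the statistic comes out at about 125.52; the 125.516 above results from summing the rounded cell contributions.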
Interpreting Chi Square test results
• The chi-square test only tests whether the variables are dependent
• Does not tell the pattern or nature of the relationship
• To investigate the pattern
– compute % within each cell and compare
• Based on the chi-square test, no dependency between homicide rate and gun sales in cities
                 HOMICIDE RATE
GUN SALES   Low          High         Total
Low         8 (66.7%)    5 (38.5%)    13
High        4 (33.3%)    8 (61.5%)    12
Total       12 (100%)    13 (100%)    N = 25
• Relationship pattern: higher % in (low, low), (high, high) cells
• Positive relationship between homicide rates and gun sales
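
The percentages in the table are column percentages (each cell divided by its homicide-rate column total). A small sketch of that computation follows. The dictionary layout is just one convenient choice.

```python
counts = {
    "Low gun sales": [8, 5],     # homicide rate: Low, High
    "High gun sales": [4, 8],
}
col_totals = [sum(col) for col in zip(*counts.values())]   # [12, 13]

percent = {
    label: [round(100 * c / t, 1) for c, t in zip(row, col_totals)]
    for label, row in counts.items()
}
for label, row in percent.items():
    print(label, row)
```

Reading across a row shows the pattern: low gun sales are more common where homicide is low, and high gun sales where homicide is high.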
2 x 2 contingency table
• Four-fold contingency table
• Short-cut Formula for chi-square
        B1     B2     Total
A1      a      b      a+b
A2      c      d      c+d
Total   a+c    b+d    n
• $\chi^2 = \frac{n(ad-bc)^2}{(a+c)(b+d)(a+b)(c+d)} = \frac{n(ad-bc)^2}{R_1 R_2 C_1 C_2}$
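
As an illustration, the short-cut formula is applied below to the 2 x 2 gun-sales table from the previous slide (a=8, b=5, c=4, d=8). The function name `chi2_2x2` is hypothetical.

```python
def chi2_2x2(a, b, c, d):
    """Short-cut chi-square for a 2 x 2 table: n(ad - bc)^2 / (R1 R2 C1 C2)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

chi2 = chi2_2x2(8, 5, 4, 8)
print(round(chi2, 2))   # 1.99
```

With 1 d.f. the 5% critical value is 3.84, so a statistic of about 1.99 does not reject independence for that table.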



Yates' continuity correction
• Improves the approximation of the discrete sample chi-square statistic to
a continuous chi-square distribution
• Used when total N for a 2 × 2 chi-square table is less than about 40
• Especially if the expected frequencies are below 10 (some authors say < 5)
• What if d.f. = 0?
– Happens when adjacent cells are combined in a 2 x 2 table
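
The slide does not spell out the corrected statistic. A commonly used form for a 2 x 2 table subtracts n/2 from |ad - bc| before squaring, as sketched below. The function name is hypothetical, and the counts are the gun-sales table used earlier (N = 25 < 40).

```python
def chi2_2x2_yates(a, b, c, d):
    """2 x 2 chi-square with Yates' continuity correction."""
    n = a + b + c + d
    # Shrink |ad - bc| by n/2, never letting it go below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

chi2_corrected = chi2_2x2_yates(8, 5, 4, 8)
print(round(chi2_corrected, 2))   # 1.02
```

The correction always lowers the statistic (here from about 1.99 to about 1.02), which makes the test more conservative.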



Exercise 1
• An experiment is performed by observing 500
people, each of whom is categorized
according to eye color and hair color. Test
whether there is any dependence between eye
color and hair color.



Exercise 2
• The severity of a disease and blood group were studied in a
research project taking a sample of 1500 patients. The
findings are given in the following contingency table. Are
the severity of the disease and blood group associated?



Exercise 3
• In order to determine the possible effect of a chemical
treatment on the rate of germination of cotton seeds a
pot culture experiment was conducted. The results are
given below. Does the chemical treatment improve the
germination rate of cotton seeds at 1 % level?



Exercise 4
• To study the relation between blood type and disease, large
samples of patients with different disease (peptic ulcer,
gastric cancer, no disease) and different blood groups (O, A,
B) were classified as follows. Test the hypothesis that there
is no association between blood type and disease, at 1%
level of significance.
                        Diseases
Blood Type   Peptic Ulcer   Gastric Cancer   No disease
O                983             383            2892
A                679             416            2625
B                134              84             570
Goodness of fit



χ2 test
test goodness of fit
• Test statistic: $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
• Number of categories: n
• degree of freedom = n – 1 – number of
estimated parameters
• Binomial and normal distributions each have two
parameters; count only those estimated from the sample



Example 3: goodness of fit
(test of proportions)
• A six-sided die is rolled 120 times. The following data show the
occurrences of each face value. Conduct a hypothesis test to
determine if the die is fair.
• Multinomial distribution
Face Value   Observed Frequency   Expected Frequency   (O-E)2/E
1                   15                   20              1.25
2                   29                   20              4.05
3                   16                   20              0.80
4                   15                   20              1.25
5                   30                   20              5.00
6                   15                   20              1.25
1. H0: p1=p2=p3=p4=p5=p6
2. α = 0.05
3. χ2 = ∑{(O-E)2/E}, d.f. n-1=5
4. χ2calc = 13.6
5. χ2tab at (α=0.05) = 11.07
   χ2calc > χ2tab ⇒ reject H0
   P-value = P(χ2>χ2calc) between 0.01 and 0.025
   P-value < 0.05 ⇒ reject H0
6. Die is biased
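
The die example reduces to a few lines of arithmetic. The sketch below recomputes the statistic and compares it with the table value 11.07 quoted in the example. It does not compute the critical value itself.

```python
observed = [15, 29, 16, 15, 30, 15]     # counts for faces 1..6
n = sum(observed)                        # 120 rolls
expected = [n / 6] * 6                   # fair die: 20 per face

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                   # no parameters estimated

critical = 11.07                         # table value at alpha = 0.05, 5 d.f.
reject_h0 = chi2 > critical
print(round(chi2, 2), df, reject_h0)     # 13.6 5 True
```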



Example 4: goodness of fit
(test of homogeneity)
• Three different coins are each flipped 200 times. Test the claim
that all three coins have the same probability of landing
heads (all 3 coins same). Let α=0.10. The experimental
results are as follows:
– Coin A: 88 heads, 112 tails
– Coin B: 93 heads, 107 tails
– Coin C: 110 heads, 90 tails
         Coin A   Coin B   Coin C   Total
Head       88       93      110      291
Tail      112      107       90      309
Total     200      200      200      600
• H0: pA=pB=pC (all three coins have the same probability of
landing heads; all 3 coins same)
• Remember: include all categories, all observations
• α=0.10
• Test statistic: χ2, d.f. (r-1)(c-1) = (2-1)(3-1) = 2



Example 4…
• Observed frequencies
         Coin A   Coin B   Coin C   Total
Head       88       93      110      291
Tail      112      107       90      309
Total     200      200      200      600
• Expected frequencies
Head       97       97       97
Tail      103      103      103
• ((O-E)^2)/E
Head   0.835052   0.164948   1.742268
Tail   0.786408   0.15534    1.640777
• χ2calc = 5.325
• χ2calc = 5.325 > χ2tab = 4.61 ⇒ reject H0
• enough evidence to reject the claim that all three
coins have the same probability of landing heads
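
A sketch of the homogeneity computation: the common proportion of heads is estimated by pooling all three coins (291/600), and each coin's expected heads and tails follow from it. The p-value line uses the closed form exp(-x/2), which is valid only because this 2 x 3 table has 2 d.f.

```python
import math

heads = [88, 93, 110]                  # coins A, B, C
tails = [112, 107, 90]
flips = [h + t for h, t in zip(heads, tails)]

pooled_p = sum(heads) / sum(flips)     # 291 / 600 under H0
exp_heads = [pooled_p * f for f in flips]          # 97 each
exp_tails = [(1 - pooled_p) * f for f in flips]    # 103 each

chi2 = sum((o - e) ** 2 / e for o, e in zip(heads, exp_heads)) + \
       sum((o - e) ** 2 / e for o, e in zip(tails, exp_tails))
df = (2 - 1) * (len(heads) - 1)        # (rows - 1) * (columns - 1)
p_value = math.exp(-chi2 / 2)          # upper tail for 2 d.f. only
print(round(chi2, 3), df, round(p_value, 3))
```

The p-value falls below α = 0.10 but above 0.05, so the same data would not reject H0 at the 5% level.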
Example 5: Goodness of fit
(test of proportions)
• For a bag of Gems (M&M), a friend claims there are 30%
yellow, 30% brown, 20% red and 20% maroon. Use the p-value
approach with significance level 0.05 to test the friend's
claim. Observed numbers are: Yellow: 58, Brown: 61,
Red: 55, Maroon: 46, Total: 220.
• H0: pY=0.3, pB=0.3, pR=0.2, pM=0.2
• α=0.05
• Test statistic: χ2, d.f. (n-1) = 3
• Expected freq. under H0:
– fY=220*0.3=66, fB=220*0.3=66, fR=220*0.2=44, fM=220*0.2=44
• χ2 = ∑ (((O-E)^2)/E): ((58-66)^2/66) + ((61-66)^2/66) + ((55-44)^2/44)
+ ((46-44)^2/44) = 0.97 + 0.38 + 2.75 + 0.09 = 4.19



Example 5…
• χ2calc = 4.19, d.f. = 4-1 = 3
• P-value at 3 d.f.: p(χ2 > χ2calc) is between 0.2
and 0.25 > (α=0.05) ⇒ do not reject H0
• Conclusion:
– Not enough evidence to reject the friend's claim of 30%
yellow, 30% brown, 20% red and 20% maroon in the
bag
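
The p-value bracket can be checked numerically. The sketch below recomputes the statistic from the claimed proportions and evaluates the upper tail with the closed form for 3 d.f., erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2). That formula is specific to 3 d.f. and is not a general routine.

```python
import math

observed = {"yellow": 58, "brown": 61, "red": 55, "maroon": 46}
claimed = {"yellow": 0.3, "brown": 0.3, "red": 0.2, "maroon": 0.2}

n = sum(observed.values())                          # 220
expected = {k: n * p for k, p in claimed.items()}   # 66, 66, 44, 44

chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

# Upper-tail probability of chi-square with exactly 3 d.f.
p_value = math.erfc(math.sqrt(chi2 / 2)) + \
          math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2)
print(round(chi2, 2), round(p_value, 3))
```

The result lies between 0.20 and 0.25, in line with the table lookup, so H0 is not rejected.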



Example 6: goodness of fit
(binomial distribution)
• The following data show the number of seeds germinating
out of 10 on damp filter for 80 sets of seeds. Can we
say that the number of germinating seeds follows a binomial
distribution with p=0.2175?

• H0: number of germinating seeds ~ B(10,0.2175)


• H1: number of germinating seeds does not follow
B(10,0.2175)

• α=0.05
• Test statistic: χ2 = ∑ ( ((O-E)^2) / E)



Example 6…
Testing goodness of fit: X~B(10, 0.2175)
X     fo   P(X=x)     fe=N*P(x)   pooled fe   fo-fe      ((fo-fe)^2)/fe
0     6    0.08607    6.88547     6.88547     -0.88547   0.113871
1     20   0.23923    19.13852    19.13852    0.861479   0.038778
2     28   0.29923    23.93843    23.93843    4.061563   0.689113
3     12   0.22179    17.74350    17.74350    -5.74351   1.859151
4     8    0.10789    8.63083     12.29407    1.705932   0.236716
5     6    0.03598    2.87878
6     0    0.00834    0.66681
7     0    0.00132    0.10591
8     0    0.00014    0.01104
9     0    8.52E-06   0.00068
10    0    2.37E-07   1.89E-05
Sum   80   1          80                                 χ2=2.9376
• Classes X = 4 to 10 are pooled (fo = 14, fe = 12.29407) so that no expected frequency is below 5
Example 6…
• χ2calc = 2.9376
• α=0.05
• Degree of freedom = number of categories - 1 -
number of estimated parameters = 5-1-0=4
– Both n and p are given
– If p is not given, compute mean using observed
frequencies and find p using mean=np
• χ2tab = 9.49 at α=0.05, d.f. = 4
• Do not reject H0 ⇒ X~B(10, 0.2175)
• P-value at 4 d.f.: p(χ2 > χ2calc) > 0.25 > (α=0.05) ⇒ do
not reject H0
• Conclusion: no evidence to reject H0, number of
germinating seeds follows B(10, 0.2175)
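
A sketch of the binomial fit, using only the observed frequencies and B(10, 0.2175) from the example. Classes X = 4 to 10 are pooled into a single class, and the pooled class is divided by its pooled expected frequency. The cut point is hard-coded here for clarity.

```python
import math

n_trials, p = 10, 0.2175
observed = [6, 20, 28, 12, 8, 6, 0, 0, 0, 0, 0]    # fo for X = 0..10
n_sets = sum(observed)                              # 80 sets of seeds

# Binomial probabilities P(X = x) and expected frequencies fe = N * P(x)
probs = [math.comb(n_trials, x) * p ** x * (1 - p) ** (n_trials - x)
         for x in range(n_trials + 1)]
expected = [n_sets * q for q in probs]

cut = 4                                             # pool X >= 4 into one class
obs_pooled = observed[:cut] + [sum(observed[cut:])]
exp_pooled = expected[:cut] + [sum(expected[cut:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1                            # n and p given, none estimated
print([round(e, 2) for e in exp_pooled], round(chi2, 4), df)
```

If p had been estimated from the sample mean instead of being given, one more degree of freedom would be subtracted.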
Fitting of normal distribution
• CHEMLINE EMPLOYEE APTITUDE TEST SCORES FOR 50
RANDOMLY CHOSEN JOB APPLICANTS
71 66 61 65 54 93 60 86 70 70 73 73 55 63 56 62 76 54
82 79 76 68 53 58 85 80 56 61 61 64 65 62 90 69 76 79
77 54 64 74 65 65 61 56 63 80 56 71 79 84
• H0: The population of test scores has a normal
distribution with mean 68 and standard deviation 10
• Ha: The population of test scores does not have a
normal distribution with mean 68 and standard
deviation 10
• Form classes/bins, classify data, find prob., compute
expected frequencies = Np, compute chi-square …



Testing goodness of fit
Fitting of normal distribution …
• Computing chi-square for X~N(68, 10)
LL   UL   F0   P(LL<X<UL)   Fe=NP      pooled F0   pooled Fe   ((O-E)^2)/E
50   55   5    0.06087      3.043508   5           3.043508    1.257713
55   60   6    0.115055     5.752746   6           5.752746    0.010627
60   65   14   0.170233     8.511659   14          8.511659    3.538897
65   70   5    0.197171     9.858557   5           9.858557    2.394425
70   75   5    0.178777     8.938832   5           8.938832    1.735618
75   80   9    0.126894     6.344699   15          11.92483    0.793021
80   85   3    0.070504     3.52521
85   90   2    0.030662     1.533101
90   95   1    0.010436     0.521824
Sum       50   0.960603                                        9.730301
• Conclude results
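
The normal-fit table can be rebuilt with the standard library alone. The sketch below follows the table's approach: class probabilities are taken from N(68, 10) between the class limits, are not renormalised, and the last four classes are pooled. The helper `normal_cdf` is a hypothetical name.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative probability P(X <= x) for X ~ N(mu, sigma)."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma, n = 68, 10, 50
edges = list(range(50, 100, 5))              # class limits 50, 55, ..., 95
observed = [5, 6, 14, 5, 5, 9, 3, 2, 1]      # F0 per class

expected = [n * (normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma))
            for lo, hi in zip(edges, edges[1:])]

cut = 5                                      # pool classes from 75-80 upward
obs_pooled = observed[:cut] + [sum(observed[cut:])]
exp_pooled = expected[:cut] + [sum(expected[cut:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1                     # mean and s.d. given by H0
print(round(chi2, 4), df)
```

With 5 d.f. the 5% table value is 11.07, so this statistic would not reject normality with mean 68 and standard deviation 10 at that level.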
Goodness of fit
Parametric test: R2
• Goodness of fit of a regression equation:
• Parametric test
• Quantitative value
• Use Coefficient of determination R2



Exercises



Exercise: Test of homogeneity
• Three samples of mobile users are selected randomly from
3 states to test proportion of android users. Data is given in
the following table. ‘yes’ represents android user. Test if all
three population proportions are almost identical.
Test for homogeneity of Several Population Proportions
Populations   Yes   No    Total
Sample I       60    40    100
Sample II      57    53    110
Sample III     48    72    120
Total         165   165    330



ANOVA or Chi-square test?
• The data about the number of units of
production per day turned out by 4 different
workers using 5 different types of machines
are given. At the 5% significance level, can it be
concluded that the workers do not differ as far
as their productivity is concerned and the
machines do not differ as far as productivity is
concerned?
• Use ANOVA: the data are measurements, not frequencies



Exercise
• As the price of oil rises, there is increased worldwide interest in
alternate sources of energy. A Financial Times/Harris Poll surveyed
people in six countries to assess attitudes toward a variety of
alternate forms of energy (Harris Interactive website, February 27,
2008). The data in the following table (on next slide) are a portion
of the poll’s findings concerning whether people favor or oppose
the building of new nuclear power plants.
1. How large was the sample in this poll?
2. Conduct a hypothesis test to determine whether people’s attitude
toward building new nuclear power plants is independent of
country. What is your conclusion?
3. Using the percentage of respondents who “strongly favor” and
“favor more than oppose,” which country has the most favorable
attitude toward building new nuclear power plants? Which
country has the least favorable attitude?



Data for the previous exercise
                                          Country
Response                 Great Britain   France   Italy   Spain   Germany   United States
Strongly favor                141          161     298     133      128         204
Favor more than oppose        348          366     309     222      272         326
Oppose more than favor        381          334     219     311      322         316
Strongly oppose               217          215     219     443      389         174
Exercise
• A newspaper publisher trying to pinpoint his market characteristics
wondered whether newspaper readership is related to reader’s
educational achievement. A survey questioned adults in the area on
their level of education and their frequency of readership. The
results are shown in the following table. At the 0.10 level of
significance, does the frequency of newspaper readership differ
according to the reader’s educational level? Use the chi-square test.
                                   Level of educational achievement
Frequency of readership   Post-Graduate   College Graduate   High school Pass   High school Not pass   Total
Never                          10               17                 11                   21               59
Sometimes                      12               23                  8                    5               48
Morning/Evening                35               38                 16                    7               96
Both editions                  28               19                  6                   13               66
Total                          85               97                 41                   46              269
Exercise
• Using the following data of 160 patients, test
at the 5% level of significance the hypothesis that
the drug is no better than sugar pills for curing
a cold. (Why not use a two-sample Z-test for
means or for proportions?)
              HELPED   HARMED   NO EFFECT   TOTAL
DRUG            50       12        18         80
SUGAR PILLS     40       14        26         80
TOTAL           90       26        44        160



Exercise
• It is believed that people who die from overdoses of
narcotics die rather young. Test this based on the following
distribution of number of deaths from overdoses:
• [Hint: An appropriate H0 hypothesis would be that equal
numbers die in all seven age groups (i.e. fe = 147/7 = 21)]
Age interval       15-19   20-24   25-29   30-34   35-39   40-44   45-49
Number of deaths     40      35      32      10      13      13       4



Exercise
• Often frequency data are tabulated according to two criteria, with a
view toward testing whether the criteria are associated. Consider
the following analysis of the 157 machine breakdowns during a
given period. Test whether the same percentage of breakdown
occurs on each machine during each shift or whether there is some
difference due to untrained operators and/or other factors peculiar
to a given shift.
                          MACHINE
                     A     B     C     D    Total per shift
Shift 1             10     6    12    13         41
Shift 2             10    12    19    21         62
Shift 3             13    10    13    18         54
Total per machine   33    28    44    52        157
