
THE CHI-SQUARE TEST

BY
V.CHAKRAPANI RAMYA
2005 BATCH
GANDHI MEDICAL COLLEGE
INTRODUCTION

After making observations or conducting experiments on medical problems, the next stage is interpreting the results, that is, drawing statistical conclusions.

The results of such observations or experiments (means, proportions, etc.) may vary from sample to sample.

So we want to know the significance of the difference observed in our work compared with the population or with another sample.
Continued…..
We express this significance as the probability that the difference occurred by chance, which is derived from the sampling distribution.

There are 2 basic methods of drawing conclusions about the significance of the results obtained.
The methods
• 1. Estimation of a population parameter from a sample statistic.

• 2. Testing of hypotheses about the population parameter.
ESTIMATION OF POPULATION PARAMETER
• Calculate the sample statistic, e.g. the sample mean (x̄).
• Set up limits around the population mean (µ): the confidence limits.
As per the NORMAL DISTRIBUTION, we can say with a stated level of confidence that the sample mean lies within the confidence limits of the population mean, as follows.
Confidence limits
• 95% of sample means lie between [µ - 1.96 x SE(x̄)] and [µ + 1.96 x SE(x̄)], where SE(x̄) is the standard error of the sample mean.
• 99% of sample means lie within µ ± 2.58 x SE(x̄), as per the normal distribution.
• Conversely, the population mean will lie within the corresponding confidence limits set around the sample mean.
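The arithmetic of the confidence limits can be sketched in Python as follows. The sample figures (mean 120, standard deviation 10, n = 100) are hypothetical, and the standard error is assumed to be the standard deviation divided by the square root of n, which the slides do not state explicitly.

```python
import math

def confidence_limits(mean, sd, n, z=1.96):
    """Return (lower, upper) limits as mean ± z x standard error."""
    se = sd / math.sqrt(n)  # assumed formula for the standard error of the mean
    return mean - z * se, mean + z * se

# Hypothetical sample: mean 120, standard deviation 10, n = 100 (SE = 1)
low95, high95 = confidence_limits(120.0, 10.0, 100, z=1.96)
low99, high99 = confidence_limits(120.0, 10.0, 100, z=2.58)

print(f"95% limits: {low95:.2f} to {high95:.2f}")
print(f"99% limits: {low99:.2f} to {high99:.2f}")
```

The 99% limits are wider than the 95% limits because a larger multiple of the standard error is needed for greater confidence.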
Testing the HYPOTHESIS

• HYPOTHESIS: A statement of belief about a population parameter.
• Steps:
• 1. State the research question in terms of statistical hypotheses.
• 2. Decide on the appropriate statistical test.
• 3. Select the level of significance for the statistical test.
Statistical Hypotheses
• Null hypothesis (H0): a statement claiming that there is no difference between the statistic of the sample and the corresponding parameter of the population.

• It nullifies the claim that the experimental result is different from, or better than, the one already observed.
Alternate hypothesis (H1)
• Also written HA.
• It hypothesizes that there is a significant difference, i.e. that the result is different from the one already observed.

• This statement disagrees with the null hypothesis.
Contd…
• If the null hypothesis is rejected (on the sample evidence), the alternate hypothesis is accepted.
• If the sample evidence is insufficient to reject the null hypothesis, it is retained but not accepted per se.
Continued…….

• 4. Determine the value the test statistic must attain to be declared significant.

• 5. Perform the calculation.

• 6. Draw and state the conclusion in words.


Tests of significance
• Definition: the mathematical methods by which the probability (p), or relative frequency, of an observed difference occurring by chance is found.
• The difference may be between two means or two proportions (of sample and universe), or between the estimates of experimental and control groups.
• These tests are useful for drawing conclusions.
Common tests in use
• The Z test, the t test and the chi-square (χ²) test.

• Z and t tests express the difference in terms of the standard error, which is a measure of the variation in sample estimates that occurs by chance.

• The ratio of the observed difference to its standard error is compared with the tabulated value, which gives the highest value obtainable by chance at different levels of significance.
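As a rough illustration of this ratio, the sketch below divides a hypothetical observed difference (5.0) by a hypothetical standard error (2.0) and compares the result with the normal-distribution values 1.96 and 2.58 quoted earlier for the 5% and 1% levels. It covers only the large-sample Z case; for the t test the tabulated values depend on the degrees of freedom.

```python
def z_ratio(observed_difference, standard_error):
    """Ratio of the observed difference to its standard error."""
    return observed_difference / standard_error

# Hypothetical figures, chosen only for illustration
z = z_ratio(5.0, 2.0)

significant_at_5_percent = abs(z) > 1.96  # tabulated value at the 5% level
significant_at_1_percent = abs(z) > 2.58  # tabulated value at the 1% level

print(f"z = {z:.2f}")
print("significant at 5%:", significant_at_5_percent)
print("significant at 1%:", significant_at_1_percent)
```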
The chi-square test
• A non-parametric test.

• The test statistic follows a special distribution called the chi-square distribution.

• Very useful in research.


Data
Two kinds:
• 1. Data that follow a specific distribution.

• 2. Data that do not, or that are skewed. For these, either
a) rescale the data, or
b) apply non-parametric tests.
Chi square test
• The chi-square logic.
• The chi-square distribution.
• Uses of chi-square.
• Formulae.
• Restrictions in its application.
• Examples.
• Overuse of chi-square.
• Summary.
The chi-square logic: observed frequencies

            Treatment   Control   Total
Positive        40         10       50
Negative        60         90      150
Total          100        100      200

Expected frequencies

            Treatment   Control   Total
Positive        25         25       50
Negative        75         75      150
Total          100        100      200


• A chi-square test may be applied to a contingency table to test the null hypothesis that rows and columns are independent.

• The chi-square test can also be used to test whether a coin is fair. A fair coin is one where heads and tails are equally likely to turn up when it is flipped. Suppose one is given a coin and asked to test whether it is fair. After 100 trials, heads turn up 53 times and tails 47 times. The following is a chi-square analysis, where the null hypothesis is that the coin is fair.
• In this case, the test has one degree of freedom and the chi-square value is 0.36. To see whether this result is statistically significant, the P-value (the probability of a result at least this extreme arising by chance if the null hypothesis is true) must be calculated or looked up in a table. The P-value is found to be about 0.55. There is thus a probability of about 55% of seeing data that deviate at least this much from the expected results if the coin is indeed fair. This is not statistically significant evidence of an unfair coin.
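The coin example can be checked with a short Python sketch. The helper names are illustrative, and the P-value function relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so it is valid for 1 degree of freedom only.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_one_df(chi2):
    """Upper-tail probability of the chi-square distribution with 1 df.

    Uses P(X > x) = erfc(sqrt(x / 2)); not valid for other df.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))

observed = [53, 47]  # heads, tails in 100 flips
expected = [50, 50]  # what a fair coin predicts

chi2 = chi_square_statistic(observed, expected)
p = p_value_one_df(chi2)

print(f"chi-square = {chi2:.2f}, P = {p:.2f}")
```

A P-value this far above 0.05 gives no reason to reject the null hypothesis that the coin is fair.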
Expected value
• For each cell of the table, the expected frequency is

$$E = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$$
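A minimal sketch of this rule in Python, applied to the observed treatment/control table shown earlier. The function name is illustrative.

```python
def expected_frequencies(observed):
    """Expected count per cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Rows: positive, negative. Columns: treatment, control.
observed = [[40, 10],
            [60, 90]]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

The result reproduces the expected-frequency table: 25 and 25 in the positive row, 75 and 75 in the negative row.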
Applications/uses
• 1. Chi-square test for independence (of proportions): an alternate test to find the significance of the difference between 2 or more proportions.
• 2. Association: to test the association between two events in binomial or multinomial samples.
• 3. Goodness of fit: to determine whether the actual (observed) numbers are similar to the expected numbers.
Chi square distribution
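• The shape of the chi-square distribution depends on its degrees of freedom (df).

• In the chi-square test for independence, the number of degrees of freedom is
df = (r - 1)(c - 1)
where r is the number of rows and c the number of columns in the table.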
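In code, the rule amounts to counting rows and columns, excluding the totals. A small sketch with an illustrative function name:

```python
def degrees_of_freedom(table):
    """df = (rows - 1) x (columns - 1); the table must exclude the totals."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)

df_2x2 = degrees_of_freedom([[40, 10], [60, 90]])             # fourfold table
df_3x4 = degrees_of_freedom([[0] * 4 for _ in range(3)])      # placeholder 3x4 table

print(df_2x2, df_3x4)
```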
Formulae

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

Where
• O_i = an observed frequency;
• E_i = an expected (theoretical) frequency, asserted by the null hypothesis;
• n = the number of possible outcomes of each event
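As a sketch of the chi-square sum of (O - E)^2 / E over all cells, the code below applies it to the treatment/control table, using the observed and expected frequencies given earlier. The function name is illustrative.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))

observed = [[40, 10],
            [60, 90]]
expected = [[25, 25],
            [75, 75]]

chi2 = chi_square(observed, expected)
print(f"chi-square = {chi2:.1f}")
```

Each positive cell contributes 225/25 = 9 and each negative cell 225/75 = 3, giving a total of 24.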
Calculation of chi square values
Three essential requirements for using the test to rule out chance variation in any observed value:

• 1. A random sample.

• 2. Qualitative data.

• 3. Lowest expected frequency (E) not less than 5.

(If Yates correction is applied, E can be less than 30 for the test of proportion.)
Calculation of chi square
• Example: apply the chi-square test to find out whether the difference seen in the table is due to chance, or whether one vaccine is really superior to the other, as it seems.

Vaccine   Attacked   Not attacked   Total
A            22           68          90
B            14           72          86
Total        36          140         176

• The E for A will be 18.41 attacked and 71.59 not attacked.
• The E for B will be 17.59 attacked and 68.41 not attacked.
• Applying the chi-square test:
Steps
• State the hypotheses.
• Represent the data in a table.
• Calculate the row totals and column totals.
• Determine E for each cell.
• Calculate the chi-square value.
• Calculate the degrees of freedom (DF).
• Refer to Fisher's chi-square table.
• Decide whether the value is significant.
• Reject or retain the null hypothesis.
• Make the final statement.
Contd…
• The chi-square value will be 1.80.

• The degrees of freedom will be 1.

• Turning to the published probability tables (Fisher's chi-square table):

• For a probability of 0.05 with 1 degree of freedom, the critical value is 3.84.

• Since the observed value is much lower, we cannot reject the null hypothesis: the difference seen may well be due to chance.
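The whole vaccine calculation can be reproduced with a short Python sketch. The function names are illustrative; the critical value 3.84 is the tabulated figure quoted above for a probability of 0.05 with 1 degree of freedom.

```python
def expected_frequencies(observed):
    """Expected count per cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))

# Rows: vaccine A, vaccine B. Columns: attacked, not attacked.
observed = [[22, 68],
            [14, 72]]

expected = expected_frequencies(observed)
chi2 = chi_square(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_VALUE_5_PERCENT_1_DF = 3.84  # from the published chi-square table
significant = chi2 > CRITICAL_VALUE_5_PERCENT_1_DF

print("expected:", [[round(e, 2) for e in row] for row in expected])
print(f"chi-square = {chi2:.2f}, df = {df}, significant at 5%: {significant}")
```

The statistic comes out at about 1.8, well below 3.84, so the difference between the vaccines is not significant at the 5% level.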
Restrictions in its application
• The test applied to a fourfold (2x2) table, with 1 df, will not give a reliable result if any E is < 5. In such cases apply Yates correction, i.e. reduce the absolute value of each (O - E) by half (0.5).

• Even after this, the test may be misleading if any E frequency is much < 5; in that case apply other appropriate tests.

• For tables larger than 2x2, Yates correction cannot be applied; a method called amalgamation (pooling of cells) can be followed.
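The effect of Yates correction can be seen in a small sketch. The 2x2 table below is hypothetical, chosen so that two expected frequencies fall below 5; the function names are illustrative, and the corrected version assumes every |O - E| is at least 0.5.

```python
def chi_square_uncorrected(observed, expected):
    """Ordinary chi-square: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_yates(observed, expected):
    """Chi-square with Yates correction: sum of (|O - E| - 0.5)^2 / E."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical fourfold table, cells listed row by row:
#   3  7 | 10
#   8  2 | 10
observed = [3, 7, 8, 2]
expected = [5.5, 4.5, 5.5, 4.5]  # row total x column total / grand total

uncorrected = chi_square_uncorrected(observed, expected)
corrected = chi_square_yates(observed, expected)

print(f"uncorrected = {uncorrected:.2f}, with Yates correction = {corrected:.2f}")
```

In this made-up table the uncorrected value (about 5.05) exceeds the 5% critical value of 3.84 for 1 df, while the corrected value (about 3.23) does not, which shows how the correction guards against overstating significance in small samples.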
Contd….

• Interpret the value of chi square with caution if the sample total (the total of the values in all the cells) is less than 50.

• The test does not measure the strength of association.

• Choose the simplest explanation that is in accord with the facts.
Important…..

• The chi-square test should not be used if an expected frequency is less than 5.
• If any expected frequency is < 2, or if > 20% of the expected frequencies are < 5, then an alternate procedure called Fisher's exact test should be performed.
• Yates continuity correction: subtracting 0.5 from the absolute difference between the observed and the expected frequencies.
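The rule about small expected frequencies can be written as a simple check. This is only a sketch of the rule as stated on this slide; the function name is illustrative.

```python
def needs_fisher_exact(expected):
    """True if any E < 2, or if more than 20% of the E values are < 5."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(1 for e in cells if e < 5) / len(cells)
    return min(cells) < 2 or share_below_5 > 0.20

# Expected frequencies of the treatment/control table: all comfortably large
large_table = [[25, 25], [75, 75]]
# Hypothetical small table: half of the expected frequencies are below 5
small_table = [[5.5, 4.5], [5.5, 4.5]]

print(needs_fisher_exact(large_table))
print(needs_fisher_exact(small_table))
```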
Overuse of chi square
• Since it is an easy test, it is often overused.

• A common misuse occurs when 2 groups are being analyzed and the characteristic of interest is measured on a numerical scale. Instead of correctly using the t test, researchers convert the numerical scale to an ordinal or even a binary scale and then use the chi-square test.
