Chapter#8 Association

COURSE TITLE:
APPLIED STATISTICS FOR OPTOMETRISTS
Ms. Rabeea Samad
CHAPTER 8
ASSOCIATION
❑ Introduction
❑ Chi-Square Test for Independence
❑ Contingency Table
❑ Relative Risk
❑ Odds Ratio
❑ 2 × 2 Contingency Table
❑ Fisher’s Exact Test
❑ Exercise
TEST OF INDEPENDENCE
❑ The chi-square distribution is used to test the null
hypothesis* that two criteria of classification, when
applied to the same set of entities, are independent.
For example:
➢ Suppose socioeconomic status and area of residence of the
inhabitants of a certain city are independent.
➢ We would then expect to find the same proportion of
families in the low, medium, and high socioeconomic
groups in all areas of the city.
*A null hypothesis is a type of hypothesis used in statistics
which proposes that there is no difference between certain
characteristics of a population (or data-generating process). For
example, a gambler may be interested in whether a game of
chance is fair; the null hypothesis is that it is fair.
CONTINGENCY TABLE
❑ The classification, according to two criteria, of a set of
entities (say, people) can be shown by a table in which
➢ the r rows represent the various levels of one criterion of
classification, and
➢ the c columns represent the various levels of the second
criterion.
Such a table is generally called a contingency table.
❑ The classification according to two criteria of a finite
population of entities is shown in Table 12.4.1.
❑ The classification according to two criteria of a finite
sample of entities is shown in Table 12.4.2.
CALCULATION OF EXPECTED FREQUENCIES
❑ The expected frequency, under the null hypothesis that
the two criteria of classification are independent, is
calculated for each cell.
❑ In probability theory, if two events A and B are
independent, the probability of their joint occurrence is
equal to the product of their individual probabilities.
That is,
$$P(A \cap B) = P(A)\,P(B)$$
➢ Under this assumption, the probability that a unit falls in
row 1 and column 1 of the above table is estimated by
$$\frac{n_{1.}}{n} \cdot \frac{n_{.1}}{n}$$
where $n_{1.}$ is the total of row 1, $n_{.1}$ is the total of
column 1, and $n$ is the total sample size.
➢ To get the expected frequency for cell 11, we multiply this
probability by the total $n$, so the expected frequency is
$$\text{Expected frequency} = \frac{n_{1.}}{n} \cdot \frac{n_{.1}}{n} \cdot n = \frac{n_{1.}\, n_{.1}}{n}$$
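The rule "row total times column total, divided by n" can be sketched in a few lines of Python. This is only an illustration: the function name `expected_frequencies` is an assumption of this sketch, and the counts are the observed frequencies from the folic acid example worked later in this chapter.

```python
def expected_frequencies(observed):
    """Expected count of each cell under independence:
    (row total * column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Observed counts from the folic acid example
# (rows: White, Black, Other; columns: Yes, No).
observed = [[260, 299], [15, 41], [7, 14]]
expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 4) for e in row])
```

The expected frequencies always add up to the same total n as the observed frequencies, which is a quick check on a hand calculation.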
SMALL EXPECTED FREQUENCIES
❑ The problem of small expected frequencies may be
encountered when analyzing data in contingency tables.
➢ Some earlier writers suggest a lower limit of 10, whereas
others suggest that all expected frequencies should be no less
than 5.
❑ Cochran suggested that, for goodness-of-fit tests, the minimum
expected frequency can be as low as 1. If one encounters one or
more expected frequencies less than 1, adjacent categories may be
combined to achieve the suggested minimum.
➢ Combining reduces the number of categories and
therefore the number of degrees of freedom.
Note:
If χ² is based on fewer than 30 degrees of freedom,
expected frequencies as small as 2 can be tolerated.
OBSERVED VS. EXPECTED FREQUENCIES
❑ To test the null hypothesis that the two criteria of
classification are independent in the population, the observed
frequencies are compared with the expected frequencies.
Note:
➢ If the discrepancy is sufficiently small, the null hypothesis
is not rejected.
➢ If the discrepancy is sufficiently large, the null hypothesis
is rejected, and we conclude that the two criteria of
classification are not independent.
➢ The decision is made on the basis of the size of the
quantity
$$\chi^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$$
➢ Under the null hypothesis, this quantity follows a χ² distribution
with (r − 1)(c − 1) degrees of freedom,
where
$o_i$ represents the observed frequency of the i-th cell,
$e_i$ represents the expected frequency of the i-th cell,
and the sum runs over all cells of the table (so the upper limit
of the sum is the number of cells, r × c).
❑ The hypothesis-testing procedure is as follows.

TESTING PROCEDURE
(i) Hypotheses:
$H_0$: The two criteria of classification are independent.
$H_A$: The two criteria of classification are dependent
(associated).
(ii) Level of significance:
α = the chosen level of significance*
(iii) Test statistic: The test statistic is
$$\chi^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$$
Under the null hypothesis it follows a χ² distribution with
(r − 1)(c − 1) degrees of freedom.
*The significance level, also denoted as alpha or α, is the probability of
rejecting the null hypothesis when it is true. For example,
a significance level of 0.05 indicates a 5% risk of concluding that a
difference exists when there is no actual difference.
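(iv) Calculation of the test statistic:
Compute the expected frequency of each cell and substitute the
observed and expected frequencies into the test statistic to obtain
the calculated value $\chi_c^2$.
(v) Statistical decision:
Reject $H_0$ if $\chi_c^2 \ge \chi_t^2$, where
$\chi_t^2 = \chi^2_{\alpha,\,(r-1)(c-1)}$ is the tabulated value and
α: level of significance
r: number of rows
c: number of columns
(vi) Conclusion:
If the calculated value $\chi_c^2$ is greater than or equal to the
tabulated value $\chi_t^2$, reject $H_0$ and conclude that the two
criteria are associated; otherwise, do not reject $H_0$.
➢ The tabulated value of χ² is read from a chi-square table, in the
row for the relevant degrees of freedom and the column for the
chosen α.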
EXAMPLE OF CONTINGENCY TABLE
Example: In 1992, the U.S. Public Health Service and the
Centers for Disease Control and Prevention recommended
that all women of childbearing age consume 400 µg of
folic acid daily to reduce the risk of having a pregnancy
affected by a neural tube defect.
In a study by Stepanuk et al., 693 pregnant women called
a teratology information service about their use of folic
acid supplementation. The researchers wished to
determine whether preconceptional use of folic acid and race
are independent. The data are given in the following table:

Race     Preconceptional Use of Folic Acid
         Yes     No      Total
White    260     299     559
Black     15      41      56
Other      7      14      21
Total    282     354     636
Solution:
Assumption: The sample is selected randomly from the population.
(i) Hypotheses:
$H_0$: Preconceptional use of folic acid and race are
independent.
$H_A$: Preconceptional use of folic acid and race are
dependent.
(ii) Level of significance:
α = 0.05
(iii) Test statistic:
$$\chi_c^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$$
(iv) Calculation:
Compute the expected frequency of each cell as
(row total × column total) / n.

Race     Preconceptional Use of Folic Acid
         Yes                             No                              Total
White    (282 × 559)/636 = 247.8584      (354 × 559)/636 = 311.1415      559
Black    (282 × 56)/636 = 24.8301        (354 × 56)/636 = 31.1698        56
Other    (282 × 21)/636 = 9.3113         (354 × 21)/636 = 11.6886        21
Total    282                             354                             636
o_i      e_i         (o_i − e_i)²    (o_i − e_i)²/e_i
260      247.8584    147.4184        0.5947
15       24.8301     96.6308         3.8916
7        9.3113      5.3421          0.5737
299      311.1415    147.4160        0.4737
41       31.1698     96.6328         3.1002
14       11.6886     5.3425          0.4571

$$\chi_c^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i} = 9.091$$
(the sum of the last column above).
(v) Statistical decision:
$\chi_t^2 = \chi^2_{\alpha,\,(r-1)(c-1)} = \chi^2_{0.05,\,2} = 5.991$ (table value)
where
α = 0.05, r = 3, and c = 2.
(vi) Conclusion:
Since 9.091 ≥ 5.991, we reject the null hypothesis $H_0$ and
conclude that there is a relationship between race and
preconceptional use of folic acid; that is, race and
preconceptional use of folic acid are dependent.
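The six steps above can be checked with a short Python sketch. The function name `chi_square_independence` is an assumption of this sketch, the critical value 5.991 is simply copied from the table value quoted in the example, and the closed-form p-value used here holds only for 2 degrees of freedom.

```python
import math

def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency
            chi2 += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Folic acid example (rows: White, Black, Other; columns: Yes, No).
observed = [[260, 299], [15, 41], [7, 14]]
chi2, df = chi_square_independence(observed)

critical = 5.991                 # table value for alpha = 0.05, df = 2
p_value = math.exp(-chi2 / 2)    # upper-tail area; valid for df = 2 only
reject_h0 = chi2 >= critical
print(round(chi2, 4), df, reject_h0, round(p_value, 4))
```

The statistic computed at full precision comes out close to 9.09; small differences from a hand calculation arise from rounding the expected frequencies.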
THE 2 × 2 CONTINGENCY TABLE
❑ Sometimes each of the two criteria of classification is
broken down into only two categories, or levels.
➢ When data are cross-classified in this manner, the result
is a contingency table consisting of two rows and two
columns.
➢ Such a table is commonly referred to as a 2 × 2 table or
cross-tabulation.
Shortcut Formula:
In this case, χ² may be calculated by the following
shortcut formula:
$$\chi^2 = \frac{n\,(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$$
where a, b, c, and d are the observed cell frequencies, laid
out as in the following table.

                     Second criterion
First criterion      1          2          Total
1                    a          b          a + b
2                    c          d          c + d
Total                a + c      b + d      n

Note:
➢ When we apply the (r − 1)(c − 1) rule for finding
degrees of freedom to a 2 × 2 table, the result is 1
degree of freedom.
EXAMPLE OF THE 2 × 2 CONTINGENCY TABLE
Example: Falls are of major concern among polio
survivors. Researchers wanted to determine the impact of
a fall on lifestyle changes. Table 12.4.6 shows the results
of a study of 233 polio survivors on whether fear of
falling resulted in lifestyle changes. Use the 2 × 2
contingency table to test whether fall status and lifestyle
change are independent.
(i) Hypotheses:
$H_0$: Fall status and lifestyle change because of fear of
falling are independent.
$H_A$: Fall status and lifestyle change because of fear of
falling are dependent.
(ii) Level of significance:
α = 0.05
(iii) Test statistic:
$$\chi_c^2 = \frac{n\,(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$$
(iv) Calculation:
$$\chi_c^2 = \frac{233\,[(131)(36) - (52)(14)]^2}{(145)(88)(183)(50)} = 31.7391$$
(v) Statistical decision:
$\chi_t^2 = \chi^2_{0.05,\,(2-1)(2-1)} = 3.841$ (table value)
where
α = 0.05, r = 2, and c = 2.
(vi) Conclusion:
Since 31.7391 ≥ 3.841, we reject the null hypothesis $H_0$ and
conclude that there is a relationship between fall
status and lifestyle change because of fear of falling.
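As a check on the arithmetic, the shortcut formula can be compared with the general observed-versus-expected formula. This is a minimal sketch; the function names are assumptions of this sketch, and the cell counts a = 131, b = 52, c = 14, d = 36 are those appearing in the calculation above.

```python
def chi_square_2x2_shortcut(a, b, c, d):
    """Shortcut formula: n(ad - bc)^2 / [(a+c)(b+d)(a+b)(c+d)]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

def chi_square_general(observed):
    """General formula: sum over cells of (o - e)^2 / e."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (o - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(observed)
        for j, o in enumerate(row)
    )

a, b, c, d = 131, 52, 14, 36
shortcut = chi_square_2x2_shortcut(a, b, c, d)
general = chi_square_general([[a, b], [c, d]])
print(round(shortcut, 4), round(general, 4))
```

Both routes give the same value, which is why the shortcut is safe to use whenever the table is 2 × 2.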
SMALL EXPECTED FREQUENCIES
❑ The problems of how to handle small expected
frequencies and small total sample sizes may arise in the
analysis of 2 × 2 contingency tables.
Note:
➢ Cochran suggested that the χ² test should not be used if
n < 20, or if 20 < n < 40 and any expected frequency is
less than 5.
➢ When n ≥ 40, an expected cell frequency as small as 1
can be tolerated.
RELATIVE RISK
➢ The data resulting from a prospective study in which the
dependent variable and the risk factor* are both
dichotomous may be displayed in a 2 × 2 contingency
table such as:

                 Disease
Risk factor      Present    Absent     Total at risk
Present          a          b          a + b
Absent           c          d          c + d
Total            a + c      b + d      n

➢ *The term risk factor is used to designate a variable that is
related to some outcome variable. For example, the outcome variable
might be cancer status, while cigarette smoking might be the risk
factor with respect to that status.
Definition:
➢ Relative risk is the ratio of the risk of developing a disease
among subjects with the risk factor to the risk of
developing the disease among subjects without the risk
factor.
$$\widehat{RR} = \frac{a/(a+b)}{c/(c+d)}$$
where $\widehat{RR}$ indicates that the relative risk is computed from
a sample, to be used as an estimate of the relative risk, RR,
for the population from which the sample was drawn.
INTERPRETATION OF RELATIVE RISK
❑ The value of the relative risk may range anywhere from
zero to infinity.
➢ A value of $\widehat{RR} = 1$ indicates that there is no association
between the risk factor and the status of the dependent variable.
➢ A value of $\widehat{RR} > 1$ indicates that the risk of acquiring the
disease is greater for subjects with the risk factor
than for those without the risk factor.
➢ A value of $\widehat{RR} < 1$ indicates that the risk of acquiring the
disease is less for subjects with the risk factor
than for those without the risk factor.
EXAMPLE OF RELATIVE RISK
Example: In a prospective study of pregnant women,
Magann et al. collected extensive information on the exercise level
of low-risk pregnant working women. A group of 217 women did
no voluntary or mandatory exercise during the pregnancy, while a
group of 238 women exercised extensively. One outcome
variable of interest was experiencing preterm labor. The results
are summarized in Table 12.7.2. Calculate the relative risk of
preterm labor when pregnant women exercise extensively.
Solution: By the relative risk equation, we have
$$\widehat{RR} = \frac{a/(a+b)}{c/(c+d)} = \frac{22/(22+216)}{18/(18+199)} = \frac{22/238}{18/217} = \frac{0.0924}{0.0829} = 1.1$$
Interpretation:
The value $\widehat{RR} = 1.1$ indicates that the risk of experiencing
preterm labor when a woman exercises heavily is 1.1
times as great as it is among women who do not exercise
at all.
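The same calculation can be sketched in Python. The function name `relative_risk` is an assumption of this sketch; the counts are those used in the solution above (a = 22, b = 216 for the women who exercised extensively; c = 18, d = 199 for those who did not exercise).

```python
def relative_risk(a, b, c, d):
    """Risk among subjects with the risk factor, a/(a+b),
    divided by risk among subjects without it, c/(c+d)."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

rr = relative_risk(22, 216, 18, 199)
print(round(rr, 2))
```

Carried to more decimal places the estimate is about 1.11, which rounds to the 1.1 reported in the solution.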
ODDS RATIO
Definition:
❑ The odds for success are the ratio of the probability of
success to the probability of failure.
➢ For this purpose we define two odds.
1. The odds of being a case (having the disease) to being a
control (not having the disease) among subjects with the risk
factor are:
$$\text{odds} = \frac{a/(a+b)}{b/(a+b)} = \frac{a}{b}$$
2. The odds of being a case (having the disease) to being a
control (not having the disease) among subjects without the
risk factor are:
$$\text{odds} = \frac{c/(c+d)}{d/(c+d)} = \frac{c}{d}$$
So the formula for the odds ratio is:
$$OR = \frac{a/b}{c/d} = \frac{a}{b} \times \frac{d}{c} = \frac{ad}{bc}$$
where a, b, c, and d are defined as in Table 12.7.3.
INTERPRETATION OF ODDS RATIO
❑ The odds ratio can assume values between zero and
infinity.
➢ A value of OR = 1 indicates that there is no association
between the risk factor and disease status.
➢ A value of OR > 1 indicates increased odds of having the
disease among subjects in whom the risk factor is present.
➢ A value of OR < 1 indicates reduced odds of the disease
among subjects with the risk factor.
EXAMPLE OF ODDS RATIO
Example: Toschke et al. collected data on the obesity status of
children ages 5–6 years and the smoking status of the mother
during the pregnancy. Table 12.7.4 shows 3970 subjects classified
as cases or non-cases of obesity and also classified according to
smoking status of the mother during pregnancy (the risk factor).
We wish to compare the odds of obesity at ages 5–6 among those
whose mother smoked throughout the pregnancy with the odds of
obesity at ages 5–6 among those whose mother did not smoke
during pregnancy.
Solution: By the odds ratio equation, we have
$$OR = \frac{ad}{bc} = \frac{223744}{23256} = 9.62$$
Interpretation:
The value OR = 9.62 indicates that the odds of having had a
mother who smoked throughout the pregnancy are 9.62 times as
high among obese children (cases) as among non-obese
children (non-cases).
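A short Python sketch of the odds ratio follows. The function name `odds_ratio` is an assumption of this sketch. Because the individual cell counts of Table 12.7.4 are not reproduced in this text, the function is applied to the preterm labor counts from the relative risk example, and the obesity result is checked only from the two products given in the solution.

```python
def odds_ratio(a, b, c, d):
    """Odds of being a case with the risk factor (a/b)
    divided by the odds without it (c/d), i.e. ad / bc."""
    return (a * d) / (b * c)

# Preterm labor counts from the relative risk example in this chapter.
or_preterm = odds_ratio(22, 216, 18, 199)

# Obesity example: only the products ad and bc are given in the solution.
or_obesity = 223744 / 23256

print(round(or_preterm, 3), round(or_obesity, 2))
```

For the preterm labor data the odds ratio is close to the relative risk, which is typical when the outcome is uncommon in both groups.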
FISHER’S EXACT TEST
❑ Sometimes we have data that can be summarized
in a 2 × 2 contingency table, but these data are
derived from very small samples.
❑ The chi-square test is not an appropriate method
of analysis if minimum expected frequency
requirements are not met.
❑ If, for example, n is less than 20, or if n is between
20 and 40 and one of the expected frequencies is
less than 5, the chi-square test should be avoided.
❑ A test that may be used when the size
requirements of the chi-square test are not met
was proposed by Fisher, Irwin, and Yates.
❑ The test has come to be known as the Fisher
exact test.
❑ It is called exact because, if desired, it permits us
to calculate the exact probability of obtaining the
observed results or results that are more extreme.
❑ Assumptions:
The following are the assumptions for the Fisher
exact test.
1. The data consist of $n_1$ sample observations from
population 1 and $n_2$ sample observations from
population 2.
2. The samples are random and independent.
3. Each observation can be categorized as one of
two mutually exclusive types.
CONTINGENCY TABLE FOR FISHER EXACT TEST
❑ The 2 × 2 contingency table for the Fisher exact
test is shown in Table 12.6.1.
EXAMPLE – FISHER EXACT TEST
Example – Sixty students were divided into two classes of 30
each and taught how to write a computer program.
One class used the conventional method of learning and
the other class used a new experimental method. At the end
of the course each student was given a test that consisted
of writing a program. The program was either correct or
incorrect, and the results were tabulated as follows:

                 Program
Class            Correct    Incorrect    Total
Conventional     23         7            30
Experimental     27         3            30
Total            50         10           60

Is there reason to believe the experimental method is
superior?
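(i) Hypotheses:
$p_1$: Probability of a correct program under the conventional method.
$p_2$: Probability of a correct program under the experimental method.
$H_0: p_1 = p_2$
$H_1: p_1 \ne p_2$
(ii) Level of significance:
α = 0.05
(iii) Test statistic: the p-value*,
$$p\text{-value} = p_0 + p_1 + \dots + p_k$$
where $p_0$ is the probability of the observed table, the remaining
terms are the probabilities of the more extreme tables with the
same marginal totals, and
$$p_i = \frac{(a_i + b_i)!\,(c_i + d_i)!\,(a_i + c_i)!\,(b_i + d_i)!}{a_i!\,b_i!\,c_i!\,d_i!\,n!}$$

                 Program
Class            Correct          Incorrect        Total
Conventional     23 (a_0)         7 (b_0)          30 (a_0 + b_0)
Experimental     27 (c_0)         3 (d_0)          30 (c_0 + d_0)
Total            50 (a_0 + c_0)   10 (b_0 + d_0)   60 (n)

(iv) Calculation:
$$p_0 = \frac{30!\,30!\,50!\,10!}{23!\,7!\,27!\,3!\,60!} \approx 0.1096$$
(v) Statistical decision:
Reject $H_0$ if p-value ≤ α.
The products of factorials involved are too large for many hand
calculators, which may overflow instead of returning a value, so the
Fisher exact test is impractical to carry out by hand here. Because
n = 60 exceeds 40 and every expected frequency is at least 5, the
chi-square test is appropriate, so we apply the chi-square test of
homogeneity instead.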
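The probability of a single table is finite and can be evaluated exactly by software. A minimal sketch: the function name `table_probability` is an assumption of this sketch, and it uses binomial coefficients, which are algebraically identical to the factorial formula for $p_i$ but avoid forming the huge factorial products directly.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of one 2 x 2 table with fixed margins:
    (a+b)!(c+d)!(a+c)!(b+d)! / (a! b! c! d! n!),
    written as C(a+b, a) * C(c+d, c) / C(n, a+c)."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

# Observed table of the programming example:
# conventional 23 correct / 7 incorrect, experimental 27 / 3.
p0 = table_probability(23, 7, 27, 3)
print(round(p0, 4))
```

Python integers have arbitrary precision, so the intermediate values do not overflow the way they can on a hand calculator.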
*p-value: In null hypothesis significance testing, the p-value
is the probability of obtaining test results at least as extreme as
the results actually observed, under the assumption that the null
hypothesis is correct.
OR
*p-value: A p-value is a measure of the probability that an
observed difference could have occurred just by random chance.
The lower the p-value, the greater the statistical significance of
the observed difference. The p-value can be used as an alternative to,
or in addition to, pre-selected confidence levels for hypothesis
testing.
(i) Hypotheses:
$H_0$: The two teaching methods are equally effective.
$H_A$: The two teaching methods are not equally effective.
(ii) Level of significance:
α = 0.05
(iii) Test statistic:
$$\chi_c^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$$
(iv) Calculation:
Expected frequencies, computed as (row total × column total) / n:

                 Program
Class            Correct               Incorrect             Total
Conventional     (50 × 30)/60 = 25     (10 × 30)/60 = 5      30
Experimental     (50 × 30)/60 = 25     (10 × 30)/60 = 5      30
Total            50                    10                    60
o_i     e_i     (o_i − e_i)²    (o_i − e_i)²/e_i
23      25      4               0.16
27      25      4               0.16
7       5       4               0.8
3       5       4               0.8

$$\chi_c^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i} = 1.92$$
(v) Statistical decision:
$\chi_t^2 = \chi^2_{\alpha,\,(r-1)(c-1)} = 3.841$ (table value)
where
α = 0.05, r = 2, and c = 2.
(vi) Conclusion:
Since 1.92 < 3.841, we do not reject the null hypothesis $H_0$.
The data do not provide sufficient evidence that the two
methods differ in effectiveness.
Example – The following table contains the results of a study
comparing radiation therapy with surgery in
treating cancer. Use the Fisher exact test to test the
hypothesis that surgery is the better treatment:

                     Cancer
Treatment            Controlled    Not Controlled    Total
Surgery              21            2                 23
Radiation Therapy    15            3                 18
Total                36            5                 41
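(i) Hypotheses:
$p_1$: Probability of cancer control among patients treated
with surgery.
$p_2$: Probability of cancer control among patients treated
with radiation therapy.
$H_0: p_1 = p_2$
$H_1: p_1 > p_2$
(ii) Level of significance:
α = 0.05
(iii) Test statistic: the p-value,
$$p\text{-value} = p_0 + p_1 + \dots + p_k$$
where, for each table i with the same marginal totals,
$$p_i = \frac{(a_i + b_i)!\,(c_i + d_i)!\,(a_i + c_i)!\,(b_i + d_i)!}{a_i!\,b_i!\,c_i!\,d_i!\,n!}$$
(iv) Calculation:

                     Cancer
Treatment            Controlled       Not Controlled    Total
Surgery              21 (a_0)         2 (b_0)           23 (a_0 + b_0)
Radiation Therapy    15 (c_0)         3 (d_0)           18 (c_0 + d_0)
Total                36 (a_0 + c_0)   5 (b_0 + d_0)     41 (n)

$$p_0 = \frac{23!\,18!\,36!\,5!}{21!\,2!\,15!\,3!\,41!} = 0.2754$$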
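To calculate the next probability, the 2 × 2 contingency table is
made one step more extreme (one more controlled case under
surgery) while keeping the marginal totals fixed:

                     Cancer
Treatment            Controlled    Not Controlled    Total
Surgery              22            1                 23
Radiation Therapy    14            4                 18
Total                36            5                 41

$$p_1 = \frac{23!\,18!\,36!\,5!}{22!\,1!\,14!\,4!\,41!} = 0.09391$$
The most extreme table with the same marginal totals is:

                     Cancer
Treatment            Controlled    Not Controlled    Total
Surgery              23            0                 23
Radiation Therapy    13            5                 18
Total                36            5                 41

$$p_2 = \frac{23!\,18!\,36!\,5!}{23!\,0!\,13!\,5!\,41!} = 0.0114$$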
$$p\text{-value} = p_0 + p_1 + p_2 = 0.2754 + 0.09391 + 0.0114 = 0.38071$$
(v) Statistical decision:
Reject $H_0$ if p-value ≤ α.
Here 0.38071 > 0.05, so $H_0$ is not rejected.
(vi) Conclusion:
Since 0.38071 > 0.05, we do not reject the null hypothesis $H_0$.
The data do not provide sufficient evidence that surgery is a
better treatment than radiation therapy for controlling cancer.
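The one-sided Fisher calculation for this example can be sketched as follows. The function names are assumptions of this sketch, and "more extreme" is taken to mean more controlled cases under surgery with all marginal totals held fixed, matching the alternative $p_1 > p_2$.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of one 2 x 2 table with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def fisher_one_sided(a, b, c, d):
    """Sum the probabilities of the observed table and of every table
    with a larger first cell, keeping all marginal totals fixed."""
    p_value = 0.0
    while b >= 0 and c >= 0:
        p_value += table_probability(a, b, c, d)
        a, b, c, d = a + 1, b - 1, c - 1, d + 1  # next more extreme table
    return p_value

# Surgery: 21 controlled, 2 not; radiation therapy: 15 controlled, 3 not.
p_value = fisher_one_sided(21, 2, 15, 3)
reject_h0 = p_value <= 0.05
print(round(p_value, 4), reject_h0)
```

The result is about 0.38, in line with the hand calculation apart from rounding of the individual terms, so the null hypothesis is not rejected at α = 0.05.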
EXERCISE (NUMERICAL QUESTIONS)
Q-1: In the study by Silver and Aiello, a secondary objective
was to determine if the frequency of falls was independent
of wheelchair use. The following table gives the data for
falls and wheelchair use among the subjects of the study.
Do these data provide sufficient evidence to warrant the
conclusion that wheelchair use and falling are related? Let
α = 0.05.
Q-2: A study was conducted by Rothenberg and Holcomb to
determine if physicians taking part in a national database of
computerized medical records performed the recommended
baseline tests when prescribing non-steroidal anti-inflammatory
drugs (NSAIDs). The researchers classified physicians in the
study into four categories—those practicing in internal
medicine, family practice, academic family practice, and
multispecialty groups. The data appear in the following table.
Do the data above provide sufficient evidence for us to
conclude that type of practice and performance of baseline tests
are related? Use α = 0.01.
Q-3: Davy et al., reported the results of a study involving
survival from cervical cancer. The researchers found that
among subjects younger than age 50, 16 of 371 subjects
had not survived for 1 year after diagnosis. In subjects age
50 or older, 219 of 376 had not survived for 1 year after
diagnosis. Compute the relative risk of death among
subjects age 50 or older.
Q-4: Toschke et al. reported on another outcome variable:
whether the child was born premature (37 weeks or fewer
of gestation). The following table summarizes the results
of this aspect of the study. The same risk factor (smoking
during pregnancy) is considered, but a case is now
defined as a mother who gave birth prematurely.
Compute the odds ratio to determine if smoking throughout
pregnancy is related to premature birth. Use the chi-square
test of independence to determine if one may conclude that
there is an association between smoking throughout
pregnancy and premature birth. Let α = 0.05.
Q-5: In a study by Xiao and Shi, researchers studied the effect
of cranberry juice in the treatment and prevention of
Helicobacter pylori infection in mice. The eradication of
Helicobacter pylori results in the healing of peptic ulcers.
Researchers compared treatment with cranberry juice to
“triple therapy” (amoxicillin, bismuth sub-citrate, and
metronidazole) in mice infected with Helicobacter pylori.
After 4 weeks, they examined the mice to determine the
frequency of eradication of the bacterium in the two
treatment groups. The following table shows the results.
May we conclude, on the basis of these data, that triple
therapy is more effective than cranberry juice at eradication
of the bacterium? Let α = 0.05 and find the p-value.
