
COURSE TITLE:

APPLIED STATISTICS FOR OPTOMETRISTS
Ms. Rabeea Samad
CHAPTER 8
ASSOCIATION

❑ Introduction
❑ Chi-Square Test for Independence
❑ Contingency Table
❑ Relative Risk
❑ Odds Ratio
❑ 2 × 2 Contingency Table
❑ Fisher’s Exact Test
❑ Exercise
TEST OF INDEPENDENCE
❑ The chi-square distribution is used to test the null
hypothesis* that two criteria of classification, when
applied to the same set of entities, are independent.
For Example:
➢ If socioeconomic status and area of residence of the
inhabitants of a certain city are independent, we would
expect to find the same proportion of families in the low,
medium, and high socioeconomic groups in all areas of
the city.

*A null hypothesis is a type of hypothesis used in statistics
which proposes that there is no difference between certain
characteristics of a population (or data-generating process). For
example, a gambler may be interested in whether a game of
chance is fair.
CONTINGENCY TABLE
❑ The classification, according to two criteria, of a set of
entities, say, people, can be shown by a table in
which
➢ 𝑟 rows represent the various levels of one criterion of
classification and
➢ 𝑐 columns represent the various levels of the second
criterion.

Such a table is generally called a 𝒄𝒐𝒏𝒕𝒊𝒏𝒈𝒆𝒏𝒄𝒚 𝒕𝒂𝒃𝒍𝒆.

❑ The classification according to two criteria of a finite
population of entities is shown in Table 12.4.1.
CONTINGENCY TABLE
 The classification according to two criteria of a finite
sample of entities is shown in Table 12.4.2.
CALCULATION OF EXPECTED FREQUENCIES
 The expected frequency, under the null hypothesis that
the two criteria of classification are independent, is
calculated for each cell.
❑ In probability theory, if two events 𝐴 and 𝐵 are
independent, the probability of their joint occurrence is
equal to the product of their individual probabilities.
That is, $P(A \cap B) = P(A)\,P(B)$
➢ Under this assumption, the probability that a unit will fall
in row 1 and column 1 of the above table is
$\frac{n_{1.}}{n} \cdot \frac{n_{.1}}{n}$
➢ To get the expected frequency for cell 11, we multiply this
probability by the total 𝑛, so the expected frequency will be
$\text{Expected Frequency} = \frac{n_{1.}}{n} \cdot \frac{n_{.1}}{n} \cdot n = \frac{n_{1.}\, n_{.1}}{n}$
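The rule "row total times column total, divided by the grand total" can be sketched in a few lines of Python. The function name `expected_frequencies` and the small table of counts are hypothetical and serve only as an illustration.

```python
def expected_frequencies(observed):
    """Return expected cell counts under independence for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Each expected count is (row total * column total) / grand total.
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical 2 x 2 table of counts, for illustration only.
table = [[30, 10],
         [20, 40]]
expected = expected_frequencies(table)
print(expected)
```

Here the row totals are 40 and 60 and both column totals are 50, so every cell in the first row has expected frequency 40 × 50 / 100 = 20.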
SMALL EXPECTED FREQUENCIES
❑ The problem of small expected frequencies may be
encountered when analyzing data in a contingency table.
➢ Some writers suggest a lower limit of 10, whereas others
suggest that all expected frequencies should be no less
than 5.

❑ Cochran suggested that, for goodness-of-fit tests, the
minimum expected frequency can be as low as 1. If one
encounters one or more expected frequencies less than 1,
adjacent categories may be combined to achieve the
suggested minimum.
➢ Combining reduces the number of categories and
therefore the number of degrees of freedom.
Note:
If χ2 is based on less than 30 degrees of freedom,
expected frequencies as small as can be tolerated.
OBSERVED VS. EXPECTED FREQUENCIES

❑ To test the null hypothesis that the two criteria of
classification are independent in the population from which
the sample was drawn, the expected and observed
frequencies are compared.

Note:
➢ If the discrepancy is sufficiently small, the null hypothesis
is not rejected.
➢ If the discrepancy is sufficiently large, the null hypothesis
is rejected, and we conclude that the two criteria of
classification are not independent.
OBSERVED VS. EXPECTED FREQUENCIES
➢ The decision is made on the basis of the size of the
quantity
$\chi^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$
➢ This follows the χ2 distribution with (𝑟 − 1)(𝑐 − 1)
degrees of freedom,
where
𝑜𝑖 represents the observed frequencies
𝑒𝑖 represents the expected frequencies
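As a minimal sketch, the quantity above is just a sum over the cells of the table. The function name `chi_square_statistic` is an assumption made for this illustration; the counts are the observed and expected frequencies from the programming-class example worked later in this chapter.

```python
def chi_square_statistic(observed, expected):
    """Sum (o - e)^2 / e over all cells, given as two flat lists."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed and expected counts from the programming-class example.
observed = [23, 27, 7, 3]
expected = [25, 25, 5, 5]
chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 2))  # 1.92
```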

❑ The hypothesis-testing procedure is illustrated as follows.


TESTING PROCEDURE
(i) Hypothesis:
𝐻0 : The two criteria of classification are independent.
𝐻𝐴 : The two criteria of classification are dependent or
associated.
(ii) Level of Significance:
𝛼 = the chosen level of significance*
(iii) Test Statistic: The test statistic is
$\chi^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$
Under the null hypothesis it follows the χ2 distribution with
(𝑟 − 1)(𝑐 − 1) degrees of freedom.

*The significance level, also denoted as alpha or α, is the probability of
rejecting the null hypothesis when it is true. For example,
a significance level of 0.05 indicates a 5% risk of concluding that a
difference exists when there is no actual difference.
TESTING PROCEDURE
(iv) Calculation of test statistic:
-------------
(v) Statistical decision:
Reject 𝐻0 if $\chi^2_c \ge \chi^2_{t=\alpha,\,(r-1)(c-1)}$
where 𝛼: Level of Significance
𝑟: Number of rows
𝑐: Number of columns
(vi) Conclusion:
Reject 𝐻0 if the calculated value $\chi^2_c$ is greater than or
equal to the tabulated value $\chi^2_t$; otherwise, do not reject 𝐻0.

➢ The tabulated value of χ2 is read at the row for the
relevant degrees of freedom and the column for the chosen 𝛼
in the following table:
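When a printed table is not at hand, the two critical values used in this chapter can be reproduced with the Python standard library. This is only a sketch that covers 1 and 2 degrees of freedom, where closed forms exist; the function name `chi2_critical` is an assumption for illustration, and a statistics package would be needed for other degrees of freedom.

```python
import math
from statistics import NormalDist

def chi2_critical(alpha, df):
    """Upper-tail chi-square critical value, for df = 1 or df = 2 only."""
    if df == 1:
        # A chi-square variable with 1 df is the square of a standard normal.
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return z * z
    if df == 2:
        # With 2 df the upper-tail probability is exp(-x / 2).
        return -2 * math.log(alpha)
    raise ValueError("this sketch only covers df = 1 and df = 2")

print(round(chi2_critical(0.05, 1), 3))  # 3.841
print(round(chi2_critical(0.05, 2), 3))  # 5.991
```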
EXAMPLE OF CONTINGENCY TABLE
Example: In 1992, the U.S. Public Health Service and the
Centers for Disease Control and Prevention recommended
that all women of childbearing age consume 400 μg of
folic acid daily to reduce the risk of having a pregnancy
affected by a neural tube defect.
In a study by Stepanuk et al., 693 pregnant women called
a teratology information service about their use of folic
acid supplementation. The researchers wished to
determine whether preconceptional use of folic acid and race
are independent. The data are given in the following table:
Race      Preconceptional Use of Folic Acid
          Yes     No      Total
White     260     299     559
Black      15      41      56
Other       7      14      21
Total     282     354     636
EXAMPLE OF CONTINGENCY TABLE
Solution: Assumption: Sample is selected randomly
from the population.
(i) Hypothesis:
𝐻0 : The preconceptional use of folic acid and race are
independent.
𝐻𝐴 : The preconceptional use of folic acid and race are
dependent.
(ii) Level of Significance:
𝛼 = 0.05
(iii) Test Statistic:
$\chi^2_c = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$
(iv) Calculation:
Compute the observed and expected frequencies.
EXAMPLE OF CONTINGENCY TABLE
Expected frequencies:
Race    Yes                           No                            Total
White   (282 × 559)/636 = 247.8584    (354 × 559)/636 = 311.1415    559
Black   (282 × 56)/636 = 24.8301      (354 × 56)/636 = 31.1698      56
Other   (282 × 21)/636 = 9.3113       (354 × 21)/636 = 11.6886      21
Total   282                           354                           636

𝑜𝑖      𝑒𝑖          (𝑜𝑖 − 𝑒𝑖)²    (𝑜𝑖 − 𝑒𝑖)²/𝑒𝑖
260     247.8584    147.4184      0.5947
15      24.8301     96.6308       3.8916
7       9.3113      5.3421        0.5737
299     311.1415    147.4160      0.4737
41      31.1698     96.6328       3.1002
14      11.6886     5.3425        0.4571
EXAMPLE OF CONTINGENCY TABLE
$\chi^2_c = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i} \approx 9.091$
(v) Statistical Decision:
Reject 𝐻0 if $\chi^2_c \ge \chi^2_{t=\alpha,\,(r-1)(c-1)}$
$\chi^2_{t=0.05,\,(3-1)(2-1)} = 5.991$ (Table Value)
where
𝛼 = 0.05, 𝑟 = 3, and 𝑐 = 2
(vi) Conclusion:
Reject 𝐻0 (the null hypothesis), as 9.091 ≥ 5.991, so
we conclude that there is a relationship between race and
preconceptional use of folic acid; that is, race and
preconceptional use of folic acid are dependent.
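The whole calculation for this example can be checked with a short script. This is a sketch rather than the method used on the slides: it carries full floating-point precision instead of four decimals, so the statistic comes out at about 9.09 and the decision is unchanged. The variable names are assumptions made for illustration.

```python
# Observed counts by race (rows) and folic acid use (columns: Yes, No).
observed = [[260, 299],
            [15, 41],
            [7, 14]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected count under independence
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical = 5.991  # tabulated value for alpha = 0.05 and 2 degrees of freedom

print(round(chi2, 2), df, chi2 >= critical)
```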
THE 2 × 2 CONTINGENCY TABLE
 Sometimes each of two criteria of classification may be
broken down into only two categories, or levels.
➢ When data are cross-classified in this manner, the result
is a contingency table consisting of two rows and two
columns.
➢ Such a table is commonly referred to as a 2 × 2 table or
cross-tabulation.
Shortcut Formula:
In this case, χ2 may be calculated by the following
shortcut formula:
$\chi^2 = \frac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$
where 𝑎, 𝑏, 𝑐, and 𝑑 are the observed cell frequencies as
shown in the following table.
THE 2 × 2 CONTINGENCY TABLE

Note:
➢ When we apply the (𝑟 − 1)(𝑐 − 1) rule for finding
degrees of freedom to a 2 × 2 table, the result is 1
degree of freedom.
EXAMPLE OF THE 2 × 2 CONTINGENCY TABLE
Example: Falls are of major concern among polio
survivors. Researchers wanted to determine the impact of
a fall on lifestyle changes. Table 12.4.6 shows the results
of a study of 233 polio survivors on whether fear of
falling resulted in lifestyle changes. Use the 2 × 2
contingency table to test whether fall status and lifestyle
change are independent.
EXAMPLE OF THE 2 × 2 CONTINGENCY TABLE
Solution: Assumption: Sample is selected randomly
from the population.
(i) Hypothesis:
𝐻0 : Fall status and lifestyle change because of fear of
falling are independent.
𝐻𝐴 : Fall status and lifestyle change because of fear of
falling are dependent.
(ii) Level of Significance:
𝛼 = 0.05
(iii) Test Statistic:
$\chi^2_c = \frac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$
(iv) Calculation:
Substitute the observed cell frequencies into the shortcut formula.
$\chi^2_c = \frac{233\,[(131)(36) - (52)(14)]^2}{(145)(88)(183)(50)} = 31.7391$
(v) Statistical Decision:
Reject 𝐻0 if $\chi^2_c \ge \chi^2_{t=\alpha,\,(r-1)(c-1)}$
$\chi^2_{t=0.05,\,(2-1)(2-1)} = 3.841$ (Table Value)
where
𝛼 = 0.05, 𝑟 = 2, and 𝑐 = 2
(vi) Conclusion:
Reject 𝐻0 (the null hypothesis), as 31.7391 ≥ 3.841, so
we conclude that there is a relationship between fall
status and lifestyle change because of fear of falling.
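The shortcut formula is easy to check in code. In this sketch the function name `chi_square_2x2` is an assumption, and the cell values a = 131, b = 52, c = 14, d = 36 are read off the substitution in the worked calculation above.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square statistic for a 2 x 2 table with cells a, b, c, d."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    return numerator / denominator

chi2 = chi_square_2x2(131, 52, 14, 36)
print(round(chi2, 4))  # 31.7391
```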
SMALL EXPECTED FREQUENCIES

❑ The problems of how to handle small expected
frequencies and small total sample sizes may arise in the
analysis of 2 × 2 contingency tables.
Note:
➢ Cochran suggested that the χ2 test should not be used if
𝑛 < 20 or if 20 < 𝑛 < 40 and any expected frequency is
less than 5.

➢ When 𝑛 ≥ 40, an expected cell frequency as small as 1
can be tolerated.
RELATIVE RISK
➢ The data resulting from a prospective study in which the
dependent variable and the risk factor* are both
dichotomous may be displayed in a (2 × 2) contingency
table such as:

*The term risk factor designates a variable that is thought to be
related to some outcome variable. For example, the outcome variable
might be cancer status, while cigarette smoking might be the risk
factor.
RELATIVE RISK
Definition –
➢ Relative risk is the ratio of the risk of developing a disease
among subjects with the risk factor to the risk of
developing the disease among subjects without the risk
factor.

$\widehat{RR} = \frac{a/(a+b)}{c/(c+d)}$

where $\widehat{RR}$ indicates that the relative risk is computed from
a sample to be used as an estimate of the relative risk, RR,
for the population from which the sample was drawn.
INTERPRETATION OF RELATIVE RISK
❑ The value of the relative risk may range anywhere from
zero to infinity.

➢ A value of $\widehat{RR} = 1$ indicates that there is no association
between the risk factor and the status of the dependent variable.

➢ A value of $\widehat{RR} > 1$ indicates that the risk of acquiring the
disease is greater for subjects with the risk factor than for
those without it.

➢ A value of $\widehat{RR} < 1$ indicates that the risk of acquiring the
disease is less for subjects with the risk factor than for
those without it.
EXAMPLE OF RELATIVE RISK
Example: In a prospective study of pregnant women,
Magann et al., collected extensive information on exercise level
of low-risk pregnant working women. A group of 217 women did
no voluntary or mandatory exercise during the pregnancy, while a
group of 238 women exercised extensively. One outcome
variable of interest was experiencing preterm labor. The results
are summarized in Table 12.7.2. Calculate the relative risk of
preterm labor when pregnant women exercise extensively.
EXAMPLE OF RELATIVE RISK
Solution: By the relative risk equation, we have
$\widehat{RR} = \frac{a/(a+b)}{c/(c+d)} = \frac{22/(22+216)}{18/(18+199)} = \frac{22/238}{18/217} = \frac{0.0924}{0.0829} = 1.1$
Interpretation:
The value of $\widehat{RR} = 1.1$ indicates that the risk of experiencing
preterm labor when a woman exercises heavily is 1.1
times as great as it is among women who do not exercise
at all.
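A minimal sketch of the same calculation in Python, using the cell values that appear in the substitution above (a = 22, b = 216, c = 18, d = 199). The function name `relative_risk` is an assumption made for illustration.

```python
def relative_risk(a, b, c, d):
    """Risk among subjects with the risk factor divided by risk among those without."""
    risk_with = a / (a + b)
    risk_without = c / (c + d)
    return risk_with / risk_without

rr = relative_risk(22, 216, 18, 199)
print(round(rr, 2))  # 1.11
```

Carried to more decimals, the estimate is about 1.114, which the slides round to 1.1.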
ODDS RATIO
Definition –
❑ The odds for success, are the ratio of the probability of
success to the probability of failure.
➢ For this purpose we define two odds.

1. The odds of being a case (having the disease) to being a
control (not having the disease) among subjects with the risk
factor is:
$\text{odds} = \frac{a/(a+b)}{b/(a+b)} = \frac{a}{b}$
2. The odds of being a case (having the disease) to being a
control (not having the disease) among subjects without
the risk factor is:
$\text{odds} = \frac{c/(c+d)}{d/(c+d)} = \frac{c}{d}$
ODDS RATIO
So the formula for the odds ratio is:
$OR = \frac{a/b}{c/d} = \frac{a}{b} \times \frac{d}{c} = \frac{ad}{bc}$
where a, b, c, and d are defined in Table 12.7.3 as:
INTERPRETATION OF ODDS RATIO
❑ The odds ratio can assume values between zero and
infinity.

➢ A value of 𝑂𝑅 = 1 indicates that there is no association
between the risk factor and disease status.

➢ A value of 𝑂𝑅 > 1 indicates increased odds of having the
disease among subjects in whom the risk factor is present.

➢ A value of 𝑂𝑅 < 1 indicates reduced odds of the disease
among subjects with the risk factor.
EXAMPLE OF ODDS RATIO
Example: Toschke et al., collected data on obesity status of
children ages 5–6 years and the smoking status of the mother
during the pregnancy. Table 12.7.4 shows 3970 subjects classified
as cases or non-cases of obesity and also classified according to
smoking status of the mother during pregnancy (the risk factor).
We wish to compare the odds of obesity at ages 5–6 among those
whose mother smoked throughout the pregnancy with the odds of
obesity at age 5–6 among those whose mother did not smoke
during pregnancy.
EXAMPLE OF ODDS RATIO
Solution: By the odds ratio equation, we have
$OR = \frac{ad}{bc} = \frac{223744}{23256} = 9.62$
Interpretation:
The value of OR = 9.62 indicates that obese children
(cases) are estimated to be 9.62 times as likely as non-obese
children (non-cases) to have had a mother who smoked
throughout the pregnancy.
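The odds ratio can be sketched as a one-line function. Note the assumption: Table 12.7.4 is not reproduced here, so the cell counts below (a = 64, b = 342, c = 68, d = 3496) are illustrative values chosen to match the products 223744 and 23256 in the solution and the total of 3970 subjects. The function name `odds_ratio` is also hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds of being a case with the risk factor divided by the odds without it."""
    return (a * d) / (b * c)

# Assumed cell counts, consistent with ad = 223744, bc = 23256, n = 3970.
a, b, c, d = 64, 342, 68, 3496
or_hat = odds_ratio(a, b, c, d)
print(round(or_hat, 2))  # 9.62
```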
FISHER’S EXACT TEST
❑ Sometimes we have data that can be summarized
in a 2×2 contingency table, but these data are
derived from very small samples.

❑ The chi-square test is not an appropriate method
of analysis if minimum expected frequency
requirements are not met.

❑ If, for example, n is less than 20, or if n is between
20 and 40 and one of the expected frequencies is
less than 5, the chi-square test should be avoided.
FISHER’S EXACT TEST
❑ A test that may be used when the size
requirements of the chi-square test are not met
was proposed by Fisher, Irwin, and Yates.

❑ The test has come to be known as the Fisher
exact test.

❑ It is called exact because, if desired, it permits us
to calculate the exact probability of obtaining the
observed results or results that are more extreme.
FISHER’S EXACT TEST
❑ Assumptions –
The following are the assumptions for the Fisher
exact test.

1. The data consist of 𝑛1 sample observations from
population 1 and 𝑛2 sample observations from
population 2.

2. The samples are random and independent.

3. Each observation can be categorized as one of
two mutually exclusive types.
CONTINGENCY TABLE FOR FISHER
EXACT TEST
❑ The 2×2 contingency table for the Fisher exact
test is shown in Table 12.6.1.
EXAMPLE – FISHER EXACT TEST
Example – 60 students were divided into two classes of 30
each and taught how to write a computer program.
One class used the conventional method of learning and
the other class used a new experimental method. At the end
of the course each student was given a test that consisted
of writing a program. The program was either correct or
incorrect, and the results were tabulated as follows:
               Program
Class          Correct   Incorrect   Total
Conventional   23        7           30
Experimental   27        3           30
Total          50        10          60

Is there reason to believe the experimental method is
superior?
EXAMPLE – FISHER EXACT TEST
Solution: Assumption: Sample is selected randomly
from the population.
(i) Hypothesis:
𝑝1 : Probability of writing a correct program under the conventional method.
𝑝2 : Probability of writing a correct program under the experimental method.

𝐻0 : 𝑝1 = 𝑝2
𝐻1 : 𝑝1 ≠ 𝑝2
(ii) Level of Significance:
𝛼 = 0.05
(iii) Test Statistic:
the p-value, where
$p\text{-value} = p_0 + p_1 + \cdots + p_k$
$p_i = \frac{(a_i + b_i)!\,(c_i + d_i)!\,(a_i + c_i)!\,(b_i + d_i)!}{a_i!\,b_i!\,c_i!\,d_i!\,n!}$
EXAMPLE – FISHER EXACT TEST
(iv) Calculation of test statistic:
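               Program
Class          Correct          Incorrect        Total
Conventional   23 (a_0)         7 (b_0)          30 (a_0 + b_0)
Experimental   27 (c_0)         3 (d_0)          30 (c_0 + d_0)
Total          50 (a_0 + c_0)   10 (b_0 + d_0)   60 (n)
$p_0 = \frac{30!\,30!\,50!\,10!}{23!\,7!\,27!\,3!\,60!}$
(v) Statistical decision:
Reject 𝐻0 if *𝑝 − 𝑣𝑎𝑙𝑢𝑒 ≤ 𝛼
Factorials as large as 60! overflow most hand calculators, so this
probability is impractical to evaluate by hand. Because 𝑛 = 60 and
every expected frequency is at least 5, the size requirements of the
chi-square test are met, so we apply the chi-square test of
homogeneity instead.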
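The probability of the observed table is finite and can be evaluated without overflow by rewriting the factorial formula in terms of binomial coefficients, which Python handles with exact integers. This is a sketch under that rewriting; the function name `table_probability` is an assumption made for illustration.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of a 2 x 2 table with cells a, b, c, d when all margins are fixed."""
    n = a + b + c + d
    # Algebraically equal to (a+b)!(c+d)!(a+c)!(b+d)! / (a! b! c! d! n!).
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

p0 = table_probability(23, 7, 27, 3)
print(round(p0, 4))  # 0.1096
```

So the observed table alone has probability of about 0.11, which already exceeds 𝛼 = 0.05.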
EXAMPLE – FISHER EXACT TEST

*𝒑 − 𝒗𝒂𝒍𝒖𝒆: In null hypothesis significance testing, the p-value
is the probability of obtaining test results at least as extreme as
the results actually observed, under the assumption that the null
hypothesis is correct.

OR

*𝒑 − 𝒗𝒂𝒍𝒖𝒆: A p-value is a measure of the probability that an
observed difference could have occurred just by random chance.
The lower the p-value, the greater the statistical significance of
the observed difference. P-value can be used as an alternative to
or in addition to pre-selected confidence levels for hypothesis
testing.
EXAMPLE – FISHER EXACT TEST
Solution: Assumption: Sample is selected randomly
from the population.
(i) Hypothesis:
𝐻0 : The two teaching methods are equally effective.
𝐻𝐴 : The two teaching methods are not equally effective.
(ii) Level of Significance:
𝛼 = 0.05
(iii) Test Statistic:
$\chi^2_c = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$
(iv) Calculation:
Compute the observed and expected frequencies.
EXAMPLE – FISHER EXACT TEST
Expected frequencies:
               Program
Class          Correct               Incorrect            Total
Conventional   (50 × 30)/60 = 25     (10 × 30)/60 = 5     30
Experimental   (50 × 30)/60 = 25     (10 × 30)/60 = 5     30
Total          50                    10                   60

𝑜𝑖    𝑒𝑖    (𝑜𝑖 − 𝑒𝑖)²    (𝑜𝑖 − 𝑒𝑖)²/𝑒𝑖
23    25    4             0.16
27    25    4             0.16
7     5     4             0.8
3     5     4             0.8
EXAMPLE – FISHER EXACT TEST
$\chi^2_c = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i} = 1.92$
(v) Statistical Decision:
Reject 𝐻0 if $\chi^2_c \ge \chi^2_{t=\alpha,\,(r-1)(c-1)}$
$\chi^2_{t=0.05,\,(2-1)(2-1)} = 3.841$ (Table Value)
where
𝛼 = 0.05, 𝑟 = 2, and 𝑐 = 2
(vi) Conclusion:
Do not reject 𝐻0 (the null hypothesis), as 1.92 < 3.841, so
we conclude that the data do not show a difference in
effectiveness between the two teaching methods.
EXAMPLE – FISHER EXACT TEST
Example – The following table contains the results of a study
comparing radiation therapy with surgery in
treating cancer. Use the Fisher exact test to test the
hypothesis that surgery is the better treatment:
                    Cancer
Treatment           Controlled   Not-Controlled   Total
Surgery             21           2                23
Radiation Therapy   15           3                18
Total               36           5                41
EXAMPLE – FISHER EXACT TEST
Solution: Assumption: Sample is selected randomly
from the population.
(i) Hypothesis:
𝑝1 : Probability of cancer control among patients treated
with surgery.
𝑝2 : Probability of cancer control among patients treated
with radiation therapy.
𝐻0 : 𝑝1 = 𝑝2
𝐻1 : 𝑝1 > 𝑝2
(ii) Level of Significance:
𝛼 = 0.05
(iii) Test Statistic:
the p-value, where
$p\text{-value} = p_0 + p_1 + \cdots + p_k$
$p_i = \frac{(a_i + b_i)!\,(c_i + d_i)!\,(a_i + c_i)!\,(b_i + d_i)!}{a_i!\,b_i!\,c_i!\,d_i!\,n!}$
EXAMPLE – FISHER EXACT TEST
(iv) Calculation of test statistic:

                    Cancer
Treatment           Controlled       Not-Controlled   Total
Surgery             21 (a_0)         2 (b_0)          23 (a_0 + b_0)
Radiation Therapy   15 (c_0)         3 (d_0)          18 (c_0 + d_0)
Total               36 (a_0 + c_0)   5 (b_0 + d_0)    41 (n)

$p_0 = \frac{23!\,18!\,36!\,5!}{21!\,2!\,15!\,3!\,41!} = 0.2754$
Now, to calculate 𝑝1, the 2×2 contingency table is adjusted to the
next more extreme table with the same marginal totals:
EXAMPLE – FISHER EXACT TEST
                    Cancer
Treatment           Controlled       Not-Controlled   Total
Surgery             22 (a_1)         1 (b_1)          23 (a_1 + b_1)
Radiation Therapy   14 (c_1)         4 (d_1)          18 (c_1 + d_1)
Total               36 (a_1 + c_1)   5 (b_1 + d_1)    41 (n)

$p_1 = \frac{23!\,18!\,36!\,5!}{22!\,1!\,14!\,4!\,41!} = 0.09391$
Now, to calculate 𝑝2, the 2×2 contingency table is adjusted once
more with the same marginal totals:
EXAMPLE – FISHER EXACT TEST
                    Cancer
Treatment           Controlled       Not-Controlled   Total
Surgery             23 (a_2)         0 (b_2)          23 (a_2 + b_2)
Radiation Therapy   13 (c_2)         5 (d_2)          18 (c_2 + d_2)
Total               36 (a_2 + c_2)   5 (b_2 + d_2)    41 (n)

$p_2 = \frac{23!\,18!\,36!\,5!}{23!\,0!\,13!\,5!\,41!} = 0.0114$
$p\text{-value} = p_0 + p_1 + p_2$
= 0.2754 + 0.09391 + 0.0114 = 0.38071
(v) Statistical decision:
Reject 𝐻0 if 𝑝 − 𝑣𝑎𝑙𝑢𝑒 ≤ 𝛼
Here 0.38071 > 0.05.
(vi) Conclusion:
Do not reject 𝐻0 (the null hypothesis), as 0.38071 > 0.05, so we
conclude that the data do not provide sufficient evidence that
surgery is the better treatment.
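The three table probabilities and their sum can be reproduced with a short sketch. The function names `table_probability` and `fisher_one_sided` are assumptions for illustration; the loop moves one observation at a time toward the alternative 𝑝1 > 𝑝2 while keeping all marginal totals fixed.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of a 2 x 2 table with cells a, b, c, d when all margins are fixed."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def fisher_one_sided(a, b, c, d):
    """Sum the probabilities of the observed table and of tables with larger a."""
    p = 0.0
    while b >= 0 and c >= 0:
        p += table_probability(a, b, c, d)
        # Shift one unit so that a grows and the margins stay the same.
        a, b, c, d = a + 1, b - 1, c - 1, d + 1
    return p

p_value = fisher_one_sided(21, 2, 15, 3)
print(round(p_value, 3))  # 0.381
```

The result agrees with the hand calculation up to rounding of the individual terms, and it is well above 𝛼 = 0.05.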
EXERCISE (NUMERICAL QUESTIONS)
Q-1: In the study by Silver and Aiello, a secondary objective
was to determine if the frequency of falls was independent
of wheelchair use. The following table gives the data for
falls and wheelchair use among the subjects of the study.

Do these data provide sufficient evidence to warrant the
conclusion that wheelchair use and falling are related? Let
𝛼 = 0.05.
EXERCISE (NUMERICAL QUESTIONS)
Q-2: A study was conducted by Rothenberg and Holcomb to
determine if physicians taking part in a national database of
computerized medical records performed the recommended
baseline tests when prescribing non-steroidal anti-inflammatory
drugs (NSAIDs). The researchers classified physicians in the
study into four categories—those practicing in internal
medicine, family practice, academic family practice, and
multispecialty groups. The data appear in the following table.

Do the data above provide sufficient evidence for us to
conclude that type of practice and performance of baseline tests
are related? Use 𝛼 = 0.01.
EXERCISE (NUMERICAL QUESTIONS)
Q-3: Davy et al., reported the results of a study involving
survival from cervical cancer. The researchers found that
among subjects younger than age 50, 16 of 371 subjects
had not survived for 1 year after diagnosis. In subjects age
50 or older, 219 of 376 had not survived for 1 year after
diagnosis. Compute the relative risk of death among
subjects age 50 or older.
EXERCISE (NUMERICAL QUESTIONS)
Q-4: Toschke et al. reported on another outcome variable:
whether the child was born premature (37 weeks or fewer
of gestation). The following table summarizes the results
of this aspect of the study. The same risk factor (smoking
during pregnancy) is considered, but a case is now
defined as a mother who gave birth prematurely.

Compute the odds ratio to determine if smoking throughout
pregnancy is related to premature birth. Use the chi-square
test of independence to determine if one may conclude that
there is an association between smoking throughout
pregnancy and premature birth. Let 𝛼 = 0.05.
EXERCISE (NUMERICAL QUESTIONS)
Q-5: In a study by Xiao and Shi, researchers studied the effect
of cranberry juice in the treatment and prevention of
Helicobacter pylori infection in mice. The eradication of
Helicobacter pylori results in the healing of peptic ulcers.
Researchers compared treatment with cranberry juice to
“triple therapy” (amoxicillin, bismuth sub-citrate, and
metronidazole) in mice infected with Helicobacter pylori.
After 4 weeks, they examined the mice to determine the
frequency of eradication of the bacterium in the two
treatment groups. The following table shows the results.

May we conclude, on the basis of these data, that triple
therapy is more effective than cranberry juice at eradication
of the bacterium? Let 𝛼 = 0.05 and find the 𝑝 − 𝑣𝑎𝑙𝑢𝑒.
