
EXAMPLE

The table below displays information for 4304 individuals, 18 years or older, interviewed during
a survey on illegal drug, tobacco, and alcohol use. The respondents have been classified according
to their answers to the following questions:
• Have you smoked more than 100 cigarettes in your life? If the answer is ‘YES’ the individual is
classified as ‘a smoker’ (could be a former or current smoker); if the answer is ‘NO’ the individual
is classified as ‘Never a smoker’.
• Have you used marijuana at least once in your life?

                          Has ever used marijuana
                          YES       NO       TOTAL
Has ever been    YES      722      1207      1929
a smoker         NO       397      1978      2375
TOTAL                    1119      3185      4304
Let S = has ever been a smoker and M = has ever used marijuana. Then

P(S∩M) = 722/4304 = 0.16775,    P(M) = 1119/4304 = 0.25999

P(M/S) = 722/1929 = 0.37429,    P(S/M) = 722/1119 = 0.6452
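
The four probabilities can be checked directly from the cell counts. The following is a minimal sketch in Python; the dictionary keys and variable names are illustrative choices, not part of the original notes.

```python
# Cell counts copied from the smoking / marijuana table.
counts = {
    ("smoker", "marijuana"): 722,
    ("smoker", "no_marijuana"): 1207,
    ("never_smoker", "marijuana"): 397,
    ("never_smoker", "no_marijuana"): 1978,
}

total = sum(counts.values())  # all respondents
both = counts[("smoker", "marijuana")]
smokers = both + counts[("smoker", "no_marijuana")]       # row total for S
marijuana = both + counts[("never_smoker", "marijuana")]  # column total for M

p_s_and_m = both / total        # P(S∩M): joint probability
p_m = marijuana / total         # P(M): marginal probability
p_m_given_s = both / smokers    # P(M/S): condition on the smoker row
p_s_given_m = both / marijuana  # P(S/M): condition on the marijuana column

print(f"P(S and M) = {p_s_and_m:.5f}")
print(f"P(M)       = {p_m:.5f}")
print(f"P(M/S)     = {p_m_given_s:.5f}")
print(f"P(S/M)     = {p_s_given_m:.4f}")
```

Note that a conditional probability only changes the denominator: the joint count 722 is divided by the total of the row or column being conditioned on instead of by the grand total.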

False positives and false negatives

HIV is the virus that causes AIDS, and ELISA (Enzyme-Linked Immunosorbent Assay) is a test
for HIV. A table for a hypothetical population of one million individuals taking the ELISA test
was constructed by Rossman and Short (1995).
Table 1.
              Test Positive   Test Negative     TOTAL
Carry HIV          4885             115          5000
No HIV            73630          921370        995000
TOTAL             78515          921485       1000000

1. False Negative. There are 5000 individuals who carry HIV, 115 of
them test negative for HIV. Thus
P(false negative)=P(test -/carry HIV)=115/5000=0.023
2. False Positive. There are 995000 individuals who do not carry
HIV, however 73630 of them test positive. Thus
P(false positive) = P(test +/No HIV) = 73630/995000 = 0.074
3. There are 4885+73630=78515 individuals who tested positive but
only 4885 of them have HIV, thus
P(HIV/test+)=4885/78515=0.06221741.
This might sound surprising, but we should remember that only a
small portion of the population carries HIV.
4. There are 921485 individuals who tested negative but 115 of them
carry HIV, thus
P(HIV/test-) =115/921485=0.0001247986
5. What is the probability of testing positive if one carries HIV?
P(test+/carry HIV) = 4885/5000 = 0.977
The conditional probability P(test+/carry HIV) reflects how good the
test is at detecting the virus when the virus is present; in the medical
world this conditional probability is called the sensitivity of a test.
The probability of a false negative is 115/5000 = 0.023. Notice that the
sensitivity of the test and the probability of a false negative add up to 1. Indeed,
sensitivity = 1 - P(false negative)
What is the probability of testing negative if one does not carry HIV?
P(test-/No HIV) = 921370/995000 = 0.926
The conditional probability P(test-/No HIV) reflects how good the test is at
identifying those who do not have the virus and is called the specificity of the test.
specificity = 1 - P(false positive)
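
All of the quantities in points 1 to 5 come from the same four cell counts of Table 1. A minimal Python sketch is given here as a check; the variable names are illustrative and not taken from Rossman and Short.

```python
# Cell counts from Table 1 (hypothetical population of one million).
true_pos, false_neg = 4885, 115       # carry HIV: test +, test -
false_pos, true_neg = 73630, 921370   # no HIV:    test +, test -

carriers = true_pos + false_neg       # row total "Carry HIV"
non_carriers = false_pos + true_neg   # row total "No HIV"

# Probabilities conditioned on the true HIV status (row totals).
sensitivity = true_pos / carriers             # P(test+/carry HIV)
specificity = true_neg / non_carriers         # P(test-/No HIV)
p_false_negative = false_neg / carriers       # P(test-/carry HIV)
p_false_positive = false_pos / non_carriers   # P(test+/No HIV)

# Probabilities conditioned on the test result (column totals).
p_hiv_given_pos = true_pos / (true_pos + false_pos)    # P(HIV/test+)
p_hiv_given_neg = false_neg / (false_neg + true_neg)   # P(HIV/test-)

print(f"sensitivity = {sensitivity:.3f}, specificity = {specificity:.3f}")
print(f"P(false negative) = {p_false_negative:.3f}")
print(f"P(false positive) = {p_false_positive:.3f}")
print(f"P(HIV/test+) = {p_hiv_given_pos:.5f}, P(HIV/test-) = {p_hiv_given_neg:.7f}")
```

The sketch makes the distinction explicit: sensitivity and specificity divide by row totals (true status), whereas P(HIV/test+) and P(HIV/test-) divide by column totals (test result), which is why a test with high sensitivity can still give a low P(HIV/test+) when few people carry the virus.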

Cross Tabular analysis

Examples:

A beverage factory manufactures and distributes three types of beverage that differ in their
caffeine content: coffee, tea and cocoa. The market researcher has raised questions concerning
differences in preferences for the three varieties among males and females. A sample of 150
persons aged between 15 and 25 years was selected. After tasting each of the prepared varieties the
individuals in the sample were asked to state their preference, i.e. their first choice. The data are
shown below. Test whether there is any relationship between sex and beverage preference at the 5%
level of significance.

                   Beverage preference
                   Coffee    Tea    Cocoa    Total
Sex    Male          40       30      10       80
       Female        20       30      20       70
Total                60       60      30      150
Solution

Step 1: Hypothesis

H0: There is no relationship between gender and beverage preference

H1: There is a relationship between gender and beverage preference

Step 2:

Level of significance = 0.05

Step 3. Test statistic

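The expected frequency of each cell, E = (row total × column total)/grand total, is shown in
brackets next to the observed frequency O. For example, for Male/Coffee, E = (80 × 60)/150 = 32.

                   Beverage preference
                   Coffee    Tea       Cocoa     Total
Sex    Male        40 (32)   30 (32)   10 (16)     80
       Female      20 (28)   30 (28)   20 (14)     70
Total              60        60        30         150

Contingency table analysis is also referred to as cross tabulation analysis.

χ² = Σ (O - E)²/E
   = (40-32)²/32 + (30-32)²/32 + (10-16)²/16 + (20-28)²/28 + (30-28)²/28 + (20-14)²/14
   = 2.000 + 0.125 + 2.250 + 2.286 + 0.143 + 2.571
   = 9.375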
Step 4: Critical value and Decision

Degrees of freedom for the Chi square = (r-1)(c-1) = (2-1)(3-1) = 2

Comparing the calculated value (9.375) with the tabulated value at the 5% level of significance
with 2 degrees of freedom (5.991), we find that the calculated value is greater. Since the
calculated value (9.375) is greater than the table value (5.991), we reject the null hypothesis.

Step 5: Conclusion
There is a relationship between gender and beverage preference.
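
The five steps can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library: the critical value 5.991 is copied from the text rather than computed, and the p-value line relies on the fact that for exactly 2 degrees of freedom the upper tail of the chi-square distribution is exp(-x/2). Variable names are illustrative.

```python
import math

# Observed frequencies: rows are Male, Female; columns are Coffee, Tea, Cocoa.
observed = [
    [40, 30, 10],
    [20, 30, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency under H0: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Test statistic: sum of (O - E)^2 / E over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

critical_value = 5.991  # table value for df = 2 at the 5% level (from the text)
p_value = math.exp(-chi_square / 2)  # valid only because df == 2

reject_h0 = chi_square > critical_value
print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The p-value comes out at roughly 0.009, well below 0.05, which agrees with the decision reached by comparing 9.375 with 5.991. For tables with other degrees of freedom the closed-form tail no longer applies and a chi-square table or a statistics library would be needed.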

Analysis of variance (ANOVA)

Example

A researcher wishes to try three different techniques to lower the blood pressure of individuals
diagnosed with high blood pressure (BP). The subjects are randomly assigned to three groups: the first
group takes medication, the second group exercises and the third group follows a special diet. After 4 weeks
the reduction in each person’s blood pressure is recorded. At the 5% level of significance, test the
claim that there is no difference among the means. The data are shown below.

Medication 10 12 9 15 13
Exercise 6 8 3 0 2
Diet 5 9 12 8 4
Solution:
Step 1: Hypothesis

H0: μ1 = μ2 = μ3 (there is no difference among the mean reductions in blood pressure)

H1: At least one of the means is different


Step 2: level of significance = 0.05
Step 3: Test statistic
Get the treatment totals as shown in the table below

                                        TOTAL    Mean
Medication    10    12     9    15    13    59     11.8
Exercise       6     8     3     0     2    19      3.8
Diet           5     9    12     8     4    38      7.6
Grand total                                116

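n = 15 observations in total, k = 3 treatments, 5 observations per treatment

Correction term C = (grand total)²/n = 116²/15 = 13456/15 = 897.067

Df for treatments = k - 1 = 3 - 1 = 2; df for error = n - k = 15 - 3 = 12

F (critical) = 3.89. This value is obtained from F tables for 2 and 12 degrees of freedom at the
5% level of significance.

Total sum of squares = Σy² - C = (10² + 12² + ... + 4²) - 897.067 = 1162 - 897.067 = 264.93

Treatment sum of squares = (59² + 19² + 38²)/5 - C = 5286/5 - 897.067 = 1057.2 - 897.067 = 160.133

Sum of squares due to error (within groups) = Total SS - Treatment SS = 264.93 - 160.133 = 104.797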
ANOVA table

Source of variation   SS        d.f.       MSS                 F calculated        F critical
Treatment             160.133   3-1=2      160.133/2=80.067    80.067/8.73=9.17    3.89
Error                 104.797   14-2=12    104.797/12=8.73
Total                 264.93    15-1=14

[Figure: F distribution with the rejection region to the right of the critical value 3.89]
Step 4: Critical value and Decision

F(2,12) = 3.89

Since F calculated (9.17) is greater than F critical(3.89) we reject the null hypothesis

Step 5: Conclusion. The means are significantly different at 5% level of significance.
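
The sums of squares in the ANOVA table can be verified with the correction-term method used in Step 3. The following Python sketch uses only the standard library; the critical value 3.89 is copied from the text, and the variable names are illustrative.

```python
# Reduction in blood pressure for each treatment group.
groups = {
    "Medication": [10, 12, 9, 15, 13],
    "Exercise": [6, 8, 3, 0, 2],
    "Diet": [5, 9, 12, 8, 4],
}

all_values = [y for values in groups.values() for y in values]
n = len(all_values)  # total number of observations
k = len(groups)      # number of treatments

# Correction term: (grand total)^2 / n.
correction = sum(all_values) ** 2 / n

ss_total = sum(y ** 2 for y in all_values) - correction
ss_treatment = sum(sum(v) ** 2 / len(v) for v in groups.values()) - correction
ss_error = ss_total - ss_treatment  # within-group variation

df_treatment, df_error = k - 1, n - k
ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f_calculated = ms_treatment / ms_error

f_critical = 3.89  # F table value for (2, 12) d.f. at the 5% level (from the text)
reject_h0 = f_calculated > f_critical

print(f"SS total = {ss_total:.3f}, SS treatment = {ss_treatment:.3f}, SS error = {ss_error:.3f}")
print(f"F = {f_calculated:.2f} with ({df_treatment}, {df_error}) d.f.")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Because the sketch carries full precision through every step, the error sum of squares prints as 104.800 rather than the 104.797 obtained from the rounded total; the F ratio still rounds to 9.17.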

Example 2

Let’s assume we have 4 treatment groups A, B, C, and D. The summary statistics for the groups
are

Group ȳ s n
A 74.3 5.6 10
B 82.8 5.1 10
C 77.8 5.3 10
D 82.9 4.6 10

The ANOVA table for this data is

Source                   DF    SS         MS        F       p-value
Between groups            3    523.70     174.57    6.55    0.0012
Within groups (Error)    36    959.58      26.65
Total                    39    1483.28

Since the F-test p-value is less than 0.05, we reject the null hypothesis H0 : µA = µB = µC = µD
at the 0.05 level.
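
The sums of squares in this ANOVA table can be recovered from the summary statistics alone, since the between-group variation depends only on the group means and sizes and the within-group variation only on the standard deviations and sizes. A minimal Python sketch follows; the names are illustrative, and the p-value is not recomputed because that would require an F distribution routine outside the standard library.

```python
# Summary statistics per group: (mean, standard deviation, sample size).
summary = {
    "A": (74.3, 5.6, 10),
    "B": (82.8, 5.1, 10),
    "C": (77.8, 5.3, 10),
    "D": (82.9, 4.6, 10),
}

n_total = sum(n for _, _, n in summary.values())
k = len(summary)

# Grand mean is the size-weighted average of the group means.
grand_mean = sum(mean * n for mean, _, n in summary.values()) / n_total

# Between groups: weighted squared deviations of group means from the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in summary.values())
# Within groups: (n - 1) * s^2 summed over groups.
ss_within = sum((n - 1) * sd ** 2 for _, sd, n in summary.values())

df_between, df_within = k - 1, n_total - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within  # this is the MSE
f_stat = ms_between / ms_within

print(f"Between: SS = {ss_between:.2f}, DF = {df_between}, MS = {ms_between:.2f}")
print(f"Within:  SS = {ss_within:.2f}, DF = {df_within}, MS = {ms_within:.2f}")
print(f"F = {f_stat:.2f}")
```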

Pairwise comparison of the means

At this point we are interested in doing pairwise comparisons of the means. That is, we want to
test hypotheses of the sort H0 : µA = µB, H0 : µA = µC, etc. The LSD method for testing the
hypothesis H0 : µA = µB proceeds as follows:

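1. Calculate

LSDA,B = t(0.05/2, edf) × √( MSE × (1/nA + 1/nB) )

where edf is the error degrees of freedom and MSE is the error mean sum of squares.

For our example

LSDA,B = 2.021 × √( 26.65 × (1/10 + 1/10) ) = 4.67

Here 2.021 is the tabulated value of t(0.025) for 40 degrees of freedom, the closest entry to
edf = 36 in most t tables; the exact value for 36 degrees of freedom is about 2.028.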

2. If |ȳA − ȳB| ≥ LSDA,B then we reject the null hypothesis H0 : µA = µB. For our example
|ȳA − ȳB| = 8.5, which is greater than 4.67, so we reject H0 : µA = µB at the 0.05 level.

We then continue to test all pairs of interest. In this case the LSD is the same for all pairs
because nA = nB = nC = nD. Thus LSDA,B = LSDA,C = ... = LSDC,D = 4.67 and we compare all
pairwise differences in the means to 4.67. Here are the absolute pairwise differences and the
results of Fisher’s LSD:

|ȳA − ȳB| = 8.5 ≥ 4.67    Reject H0 : μA = μB
|ȳA − ȳC| = 3.5 < 4.67    Fail to reject H0 : μA = μC
|ȳA − ȳD| = 8.6 ≥ 4.67    Reject H0 : μA = μD
|ȳB − ȳC| = 5.0 ≥ 4.67    Reject H0 : μB = μC
|ȳB − ȳD| = 0.1 < 4.67    Fail to reject H0 : μB = μD
|ȳC − ȳD| = 5.1 ≥ 4.67    Reject H0 : μC = μD
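
The pairwise comparisons can be automated by looping over all pairs of groups. The sketch below is a minimal Python illustration of Fisher's LSD as described above; it takes the t value 2.021 and MSE 26.65 from the text as given, and the function and variable names are illustrative.

```python
import math
from itertools import combinations

means = {"A": 74.3, "B": 82.8, "C": 77.8, "D": 82.9}
sizes = {"A": 10, "B": 10, "C": 10, "D": 10}
mse = 26.65         # error mean square from the ANOVA table
t_critical = 2.021  # t value used in the text for alpha/2 = 0.025


def lsd(group_a, group_b):
    """Least significant difference for one pair of groups."""
    return t_critical * math.sqrt(mse * (1 / sizes[group_a] + 1 / sizes[group_b]))


results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    reject = diff >= lsd(a, b)
    results[(a, b)] = (round(diff, 1), reject)
    decision = "reject H0" if reject else "fail to reject H0"
    print(f"|mean {a} - mean {b}| = {diff:.1f} vs LSD = {lsd(a, b):.2f}: {decision}")
```

Computing the LSD inside a function for each pair means the same sketch would also work with unequal group sizes, where the LSD differs from pair to pair.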
