Types of Reliability of Disease Classification

CLASSIFICATION

Nigel Paneth

TERMINOLOGY

Reliability is analogous to precision.

Validity is analogous to accuracy.

Reliability is how well an observer classifies the same individual under different circumstances.

Validity is how well a given test reflects another test of known greater accuracy.

RELIABILITY AND VALIDITY

Reliability includes:

assessments of the same observer at different times - INTRA-OBSERVER RELIABILITY

assessments of different observers at the same time - INTER-OBSERVER RELIABILITY

Reliability assumes that all tests or observers are equal; validity assumes that there is a gold standard to which a test or observer should be compared.

ASSESSING RELIABILITY

How do we assess reliability?

One way is to look simply at percent agreement.

Percent agreement is the proportion of all diagnoses classified the same way by two observers.

EXAMPLE OF PERCENT AGREEMENT

Two physicians are each given a set of 100 X-rays to look at independently and asked to judge whether pneumonia is present or absent. When both sets of diagnoses are tallied, it is found that 95% of the diagnoses are the same.

IS PERCENT AGREEMENT GOOD ENOUGH?

Do these two physicians exhibit high diagnostic reliability?

Can there be 95% agreement between two observers without really having good reliability?

Compare the two tables below:

Table 1
                  MD#1
               Yes     No
MD#2   Yes       1      3
       No        2     94

Table 2
                  MD#1
               Yes     No
MD#2   Yes      43      3
       No        2     52

In both instances, the physicians agree 95% of the time. Are the two physicians equally reliable in the two tables?
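The 95% figure can be checked directly from the cell counts. The following is a minimal sketch, assuming each table is stored as a 2x2 list with MD#2 in the rows and MD#1 in the columns; the function name `percent_agreement` is illustrative and does not come from the slides.

```python
def percent_agreement(table):
    """Proportion of observations on which the two observers agree.

    `table` is a 2x2 list of counts: rows are MD#2 (Yes, No),
    columns are MD#1 (Yes, No). This layout is an assumption.
    """
    (a, b), (c, d) = table
    total = a + b + c + d
    # Agreement cells are Yes/Yes (a) and No/No (d).
    return (a + d) / total


table_1 = [[1, 3], [2, 94]]
table_2 = [[43, 3], [2, 52]]

print(percent_agreement(table_1))
print(percent_agreement(table_2))
```

Both calls give the same proportion, which is why percent agreement alone cannot tell the two tables apart.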

What is the essential difference between the two tables?

The problem arises from the ease of agreement on common events (e.g. not having pneumonia in the first table).

So a measure of agreement should take into account the ease of agreement due to chance alone.

USE OF THE KAPPA STATISTIC TO ASSESS RELIABILITY

Kappa is a widely used test of inter- or intra-observer agreement (or reliability) which corrects for chance agreement.

KAPPA VARIES FROM +1 TO -1

+1 means that the two observers are perfectly reliable. They classify everyone exactly the same way.

0 means there is no relationship at all between the two observers' classifications, above the agreement that would be expected by chance.

-1 means the two observers classify exactly the opposite of each other. If one observer says yes, the other always says no.
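The three reference values can be illustrated with a small sketch. The three tables below are hypothetical (they are not from the slides) and were chosen with evenly split marginal totals so that the extremes are reached exactly; the function `kappa` is an illustrative implementation of chance-corrected agreement for a 2x2 table.

```python
def kappa(table):
    """Chance-corrected agreement for a 2x2 table of counts.

    Rows are one observer (Yes, No), columns the other (Yes, No).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical tables, 100 subjects each.
perfect = [[50, 0], [0, 50]]       # observers always agree
chance = [[25, 25], [25, 25]]      # agreement no better than chance
opposite = [[0, 50], [50, 0]]      # observers always disagree

for name, table in [("perfect", perfect), ("chance", chance), ("opposite", opposite)]:
    print(name, kappa(table))
```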

GUIDE TO USE OF KAPPAS IN EPIDEMIOLOGY AND MEDICINE

Kappa > .80 is considered excellent

Kappa .60 - .80 is considered good

Kappa .40 - .60 is considered fair

Kappa < .40 is considered poor
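The guide above can be written as a small lookup. This is a sketch only: the slide does not say which label applies at exactly .40, .60 or .80, so the code assumes that .80 counts as good and that .60 and .40 belong to the higher band. The function name `interpret_kappa` is illustrative.

```python
def interpret_kappa(kappa):
    """Map a Kappa value to the qualitative labels in the guide.

    Boundary handling is an assumption: the guide leaves the exact
    cut points (.40, .60, .80) ambiguous.
    """
    if kappa > 0.80:
        return "excellent"
    if kappa >= 0.60:
        return "good"
    if kappa >= 0.40:
        return "fair"
    return "poor"


for value in (0.90, 0.70, 0.50, 0.26):
    print(value, interpret_kappa(value))
```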

1st WAY TO CALCULATE KAPPA

1. Calculate observed agreement (observations in the cells in which the observers agree / total observations). In both Table 1 and Table 2 it is 95%.

2. Calculate expected agreement (chance agreement) based on the marginal totals.

Table 1's marginal totals are:

OBSERVED
                  MD#1
               Yes     No   Total
MD#2   Yes       1      3       4
       No        2     94      96
       Total     3     97     100

How do we calculate the N expected by chance in each cell?

We assume that each cell should reflect the marginal distributions, i.e. the proportion of yes and no answers should be the same within the four-fold table as in the marginal totals.

EXPECTED
                  MD#1
               Yes     No   Total
MD#2   Yes                      4
       No                      96
       Total     3     97     100

To do this, we find the proportion of answers in either the column (3% and 97%, yes and no respectively for MD #1) or row (4% and 96%, yes and no respectively for MD #2) marginal totals, and apply one of the two proportions to the other marginal total. For example, 96% of the row totals are in the No category. Therefore, by chance, 96% of MD #1's Nos should also be classified No by MD #2. 96% of 97 is 93.12.

EXPECTED
                  MD#1
               Yes     No   Total
MD#2   Yes                      4
       No           93.12      96
       Total     3     97     100

By subtraction, all other cells fill in automatically, and each yes/no distribution reflects the marginal distribution. Any cell could have been used to make the calculation, because once one cell is specified in a 2x2 table with fixed marginal distributions, all other cells are also specified.

EXPECTED
                  MD#1
               Yes     No   Total
MD#2   Yes    0.12   3.88       4
       No     2.88  93.12      96
       Total     3     97     100
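The expected table can also be filled in one step: each expected cell equals its row total times its column total, divided by the grand total, which gives the same numbers as the fill-by-subtraction procedure. A minimal sketch, assuming the same 2x2 list layout as the tables (rows MD#2, columns MD#1); the name `expected_counts` is illustrative.

```python
def expected_counts(table):
    """Counts expected by chance in each cell of a 2x2 table.

    Each expected cell is (row total * column total) / grand total.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    return [[r * col / n for col in col_totals] for r in row_totals]


observed_table_1 = [[1, 3], [2, 94]]
expected_table_1 = expected_counts(observed_table_1)

for row in expected_table_1:
    print([round(cell, 2) for cell in row])
```

For Table 1 this should reproduce the cells 0.12, 3.88, 2.88 and 93.12.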

Now you can see that just by the operation of chance, 93.24 of the 100 observations should have been agreed to by the two observers (93.12 + 0.12).

Let's now compare the actual agreement with the expected agreement.

Expected agreement is 6.76% from perfect agreement of 100% (100 - 93.24).

Actual agreement is 5.0% from perfect agreement (100 - 95).

So our two observers were 1.76% better than chance, but if they had agreed perfectly they would have been 6.76% better than chance. So they are really only about one quarter (0.26) of the way from chance agreement to perfect agreement (1.76/6.76).

Below is the formula for calculating Kappa from expected agreement:

$$\kappa = \frac{\text{Observed agreement} - \text{Expected agreement}}{1 - \text{Expected agreement}}$$

$$\kappa = \frac{95\% - 93.24\%}{100\% - 93.24\%} = \frac{1.76\%}{6.76\%} = 0.26$$
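The formula is a one-line calculation once the two agreements are expressed as proportions. A minimal sketch, with an illustrative function name (`kappa_from_agreement`) that does not come from the slides:

```python
def kappa_from_agreement(observed, expected):
    """Kappa from observed and expected agreement, both as proportions (0 to 1)."""
    return (observed - expected) / (1 - expected)


# Table 1: observed agreement 95%, expected (chance) agreement 93.24%.
kappa_table_1 = kappa_from_agreement(0.95, 0.9324)
print(round(kappa_table_1, 2))  # 0.26
```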

How good is a Kappa of 0.26?

Kappa > .80 is considered excellent

Kappa .60 - .80 is considered good

Kappa .40 - .60 is considered fair

Kappa < .40 is considered poor

In the second example, the observed agreement was also 95%, but the marginal totals were very different.

ACTUAL
                  MD#1
               Yes     No   Total
MD#2   Yes                     46
       No                      54
       Total    45     55     100

Using the same procedure as before, we calculate the expected N in any one cell, based on the marginal totals. For example, the lower right cell is 54% of 55, which is 29.7.

EXPECTED
                  MD#1
               Yes     No   Total
MD#2   Yes                     46
       No            29.7      54
       Total    45     55     100

And, by subtraction, the other cells are as below. The cells which indicate agreement (Yes/Yes and No/No, 20.7 + 29.7) add up to 50.4%.

EXPECTED
                  MD#1
               Yes     No   Total
MD#2   Yes    20.7   25.3      46
       No     24.3   29.7      54
       Total    45     55     100

Enter the two agreements into the formula:

$$\kappa = \frac{\text{Observed agreement} - \text{Expected agreement}}{1 - \text{Expected agreement}}$$

$$\kappa = \frac{95\% - 50.4\%}{100\% - 50.4\%} = \frac{44.6\%}{49.6\%} = 0.90$$

In this example, the observers have the same % agreement, but now their agreement is much better than chance.

Kappa of 0.90 is considered excellent.
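Putting the steps together for both tables shows where the difference comes from: observed agreement is identical, but the agreement expected by chance is not. The sketch below assumes the 2x2 list layout used for the tables (rows MD#2, columns MD#1); `agreement_summary` is an illustrative name.

```python
def agreement_summary(table):
    """Observed agreement, chance-expected agreement and Kappa for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    observed = (a + d) / n
    # Expected agreement: sum of the two expected agreement cells, as a proportion.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return {"observed": observed, "expected": expected, "kappa": kappa}


table_1 = [[1, 3], [2, 94]]
table_2 = [[43, 3], [2, 52]]

summary_1 = agreement_summary(table_1)
summary_2 = agreement_summary(table_2)

for label, s in [("Table 1", summary_1), ("Table 2", summary_2)]:
    print(label, {key: round(value, 4) for key, value in s.items()})
```

With Table 1 almost all of the 95% agreement is already expected by chance (93.24%), while with Table 2 only 50.4% is.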

A 2nd WAY TO CALCULATE THE KAPPA STATISTIC

$$\kappa = \frac{2(AD - BC)}{N_1 N_4 + N_2 N_3}$$

where the Ns are the marginal totals (N1 to N4), labeled thus:

                  MD#1
               Yes     No   Total
MD#2   Yes       A      B      N1
       No        C      D      N2
       Total    N3     N4   total

Look again at Table 1 and Table 2 above.

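For Table 1:

$$\kappa = \frac{2(94 \times 1 - 2 \times 3)}{4 \times 97 + 3 \times 96} = \frac{176}{676} = 0.26$$

For Table 2:

$$\kappa = \frac{2(52 \times 43 - 3 \times 2)}{46 \times 55 + 45 \times 54} = \frac{4460}{4960} = 0.90$$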
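The cross-product formula and the observed/expected formula should give the same Kappa for any 2x2 table. The sketch below checks this numerically for the two tables; both function names are illustrative, and the cell labels A, B, C, D and totals N1 to N4 follow the labeled table.

```python
import math


def kappa_cross_product(a, b, c, d):
    """Kappa from the cross-products: 2(AD - BC) / (N1*N4 + N2*N3)."""
    n1, n2 = a + b, c + d   # row totals (MD#2 Yes, MD#2 No)
    n3, n4 = a + c, b + d   # column totals (MD#1 Yes, MD#1 No)
    return 2 * (a * d - b * c) / (n1 * n4 + n2 * n3)


def kappa_observed_expected(a, b, c, d):
    """Kappa from observed and chance-expected agreement."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


for cells in [(1, 3, 2, 94), (43, 3, 2, 52)]:
    first = kappa_observed_expected(*cells)
    second = kappa_cross_product(*cells)
    print(cells, round(first, 2), round(second, 2), math.isclose(first, second))
```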

Note parallels between:

THE ODDS RATIO

THE CHI-SQUARE STATISTIC

THE KAPPA STATISTIC

Note that the cross-products of the four-fold table, and their relation to marginal totals, are central to all three expressions.
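For reference, the three expressions can be written side by side in the cell notation of the labeled table (A, B, C, D with marginal totals N1 to N4 and grand total N). The slides do not spell out the odds ratio or chi-square formulas; the standard forms are assumed here, with the chi-square shown without continuity correction.

```latex
\begin{align*}
\text{Odds ratio} &= \frac{AD}{BC} \\[4pt]
\chi^2 &= \frac{N\,(AD - BC)^2}{N_1 N_2 N_3 N_4} \\[4pt]
\kappa &= \frac{2\,(AD - BC)}{N_1 N_4 + N_2 N_3}
\end{align*}
```

In each case the numerator is built from the cross-products AD and BC, and the chi-square and Kappa denominators are built from the marginal totals.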

