Types of Reliability of Disease Classification



Nigel Paneth

TERMINOLOGY

Reliability is analogous to precision.

Validity is analogous to accuracy.

Reliability is how well an observer classifies the same individual under different circumstances.

Validity is how well a given test reflects another test of known greater accuracy.

RELIABILITY AND VALIDITY

Reliability includes:

- assessments by the same observer at different times - INTRA-OBSERVER RELIABILITY
- assessments by different observers at the same time - INTER-OBSERVER RELIABILITY

Reliability assumes that all tests or observers are equal; validity assumes that there is a gold standard to which a test or observer should be compared.

ASSESSING RELIABILITY

How do we assess reliability?

One way is to look simply at percent agreement. Percent agreement is the proportion of all diagnoses classified the same way by two observers.

EXAMPLE OF PERCENT AGREEMENT

Two physicians are each given a set of 100 X-rays to look at independently and asked to judge whether pneumonia is present or absent. When both sets of diagnoses are tallied, it is found that 95% of the diagnoses are the same.
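As a quick illustration (not part of the original lecture), percent agreement for a 2x2 tally is just the diagonal count over the total. The counts used here are those of Table 1 further below.

```python
# A minimal sketch: percent agreement is the proportion of pairs
# where both observers give the same answer (the diagonal of the
# 2x2 table) out of all pairs.
def percent_agreement(table):
    """table[i][j] = count where MD#2 gave answer i and MD#1 gave
    answer j (index 0 = Yes, 1 = No)."""
    total = sum(sum(row) for row in table)
    agree = table[0][0] + table[1][1]  # both Yes, or both No
    return 100.0 * agree / total

# The X-ray example: 95 of 100 diagnoses match.
table1 = [[1, 3], [2, 94]]
print(percent_agreement(table1))  # 95.0
```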

IS PERCENT AGREEMENT GOOD ENOUGH?

Do these two physicians exhibit high diagnostic reliability?

Can there be 95% agreement between two observers without really having good reliability?

Compare the two tables below:

Table 1
              MD#1
             Yes    No
MD#2  Yes      1     3
      No       2    94

Table 2
              MD#1
             Yes    No
MD#2  Yes     43     3
      No       2    52

In both instances, the physicians agree 95% of the time. Are the two physicians equally reliable in the two tables?


What is the essential difference between the two tables?

The problem arises from the ease of agreement on common events (e.g. not having pneumonia in the first table). So a measure of agreement should take into account the ease of agreement due to chance alone.

USE OF THE KAPPA STATISTIC TO ASSESS RELIABILITY

Kappa is a widely used test of inter- or intra-observer agreement (or reliability) which corrects for chance agreement.

KAPPA VARIES FROM +1 TO -1

+1 means that the two observers are perfectly reliable. They classify everyone exactly the same way.

0 means there is no relationship at all between the two observers' classifications, above the agreement that would be expected by chance.

-1 means the two observers classify exactly the opposite of each other. If one observer says yes, the other always says no.

GUIDE TO USE OF KAPPAS IN EPIDEMIOLOGY AND MEDICINE

Kappa > .80 is considered excellent
Kappa .60 - .80 is considered good
Kappa .40 - .60 is considered fair
Kappa < .40 is considered poor
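The guide above can be encoded as a small helper. Note that assigning the boundary values (.40, .60) to the higher category is an assumption on my part; the guide leaves the ties ambiguous (only .80 is unambiguous, since "excellent" requires strictly greater).

```python
# Map a kappa value to the verbal category from the guide.
# Assumption: boundary values .40 and .60 fall in the higher
# category; .80 itself is "good" because "excellent" is strictly > .80.
def kappa_category(kappa):
    if kappa > 0.80:
        return "excellent"
    if kappa >= 0.60:
        return "good"
    if kappa >= 0.40:
        return "fair"
    return "poor"

print(kappa_category(0.26))  # poor
print(kappa_category(0.90))  # excellent
```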

1st WAY TO CALCULATE KAPPA

1. Calculate observed agreement (cells in which the observers agree / total observations). In both Table 1 and Table 2 it is 95%.

2. Calculate expected agreement (chance agreement) based on the marginal totals.

Table 1's marginal totals are:

OBSERVED
              MD#1
             Yes    No
MD#2  Yes      1     3      4
      No       2    94     96
               3    97    100

How do we calculate the N expected by chance in each cell?

We assume that each cell should reflect the marginal distributions, i.e. the proportion of yes and no answers should be the same within the four-fold table as in the marginal totals.


EXPECTED
              MD#1
             Yes    No
MD#2  Yes      ?     ?      4
      No       ?     ?     96
               3    97    100

To do this, we find the proportion of answers in either the column marginal totals (3% and 97%, yes and no respectively, for MD #1) or the row marginal totals (4% and 96%, yes and no respectively, for MD #2), and apply one of the two proportions to the other marginal total. For example, 96% of the row totals are in the No category. Therefore, by chance, 96% of MD #1's Nos should also be in the No column. 96% of 97 is 93.12.

EXPECTED
              MD#1
             Yes     No
MD#2  Yes      ?      ?      4
      No       ?  93.12     96
               3     97    100

By subtraction, all other cells fill in automatically, and each yes/no distribution reflects the marginal distribution. Any cell could have been used to make the calculation, because once one cell is specified in a 2x2 table with fixed marginal distributions, all other cells are also specified.
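The fill-in-by-marginals rule described above amounts to the familiar (row total x column total) / grand total calculation for each cell. A minimal sketch:

```python
# Chance-expected counts for a 2x2 table: each cell reflects the
# marginal distributions, i.e. expected = row total * column total / n.
def expected_table(table):
    rows = [sum(r) for r in table]        # MD#2 (row) marginal totals
    cols = [sum(c) for c in zip(*table)]  # MD#1 (column) marginal totals
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]

obs = [[1, 3], [2, 94]]
for row in expected_table(obs):
    print(row)
# [0.12, 3.88]
# [2.88, 93.12]
```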

EXPECTED
              MD#1
             Yes     No
MD#2  Yes   0.12   3.88      4
      No    2.88  93.12     96
               3     97    100

Now you can see that, just by the operation of chance, 93.24 of the 100 observations should have been agreed on by the two observers (93.12 + 0.12).


Let's now compare the actual agreement with the expected agreement.

Expected agreement is 6.76% short of perfect agreement of 100% (100 - 93.24).

Actual agreement is 5.0% short of perfect agreement (100 - 95).

So our two observers were 1.76% better than chance, but if they had agreed perfectly they would have been 6.76% better than chance. So they are really only about a quarter of the way from chance to perfect agreement (1.76/6.76 = 0.26).

Below is the formula for calculating Kappa from expected agreement:

Kappa = (Observed agreement - Expected agreement) / (1 - Expected agreement)

      = (95% - 93.24%) / (100% - 93.24%) = 1.76% / 6.76% = .26
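Putting both steps together, a small illustrative function (using the row total x column total / grand total rule for expected agreement) reproduces the value above:

```python
# Cohen's kappa for a 2x2 table:
# (observed agreement - chance agreement) / (1 - chance agreement).
def kappa(table):
    n = sum(map(sum, table))
    rows = [sum(r) for r in table]        # row marginal totals
    cols = [sum(c) for c in zip(*table)]  # column marginal totals
    observed = (table[0][0] + table[1][1]) / n
    expected = (rows[0] * cols[0] + rows[1] * cols[1]) / n**2
    return (observed - expected) / (1 - expected)

print(round(kappa([[1, 3], [2, 94]]), 2))   # 0.26  (Table 1)
print(round(kappa([[43, 3], [2, 52]]), 2))  # 0.9   (Table 2)
```

The same 95% agreement yields a very different kappa once the chance agreement implied by the marginals is removed.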

How good is a Kappa of 0.26? By the guide above, a Kappa below .40 is considered poor: despite 95% agreement, these two observers show poor reliability.

In the second example, the observed agreement was also 95%, but the marginal totals were very different:

OBSERVED
              MD#1
             Yes    No
MD#2  Yes      ?     ?     46
      No       ?     ?     54
              45    55    100

Using the same procedure as before, we calculate the expected N in any one cell, based on the marginal totals. For example, the lower right cell is 54% of 55, which is 29.7.

EXPECTED
              MD#1
             Yes    No
MD#2  Yes      ?     ?     46
      No       ?  29.7     54
              45    55    100

And, by subtraction, the other cells are as below. The diagonal cells, which indicate agreement (Yes/Yes and No/No), add up to 50.4.

EXPECTED
              MD#1
             Yes    No
MD#2  Yes   20.7  25.3     46
      No    24.3  29.7     54
              45    55    100

Enter the two agreements into the formula:

Kappa = (Observed agreement - Expected agreement) / (1 - Expected agreement)

      = (95% - 50.4%) / (100% - 50.4%) = 44.6% / 49.6% = .90

In this example, the observers have the same percent agreement, but now they are much better than chance. A Kappa of 0.90 is considered excellent.

A 2nd WAY TO CALCULATE THE KAPPA STATISTIC

Kappa = 2(AD - BC) / (N1 x N4 + N2 x N3)

where the Ns are the marginal totals, labeled thus:

              MD#1
             Yes    No
MD#2  Yes      A     B     N1
      No       C     D     N2
              N3    N4   total

Look again at Tables 1 and 2 above.

For Table 1:

2(94 x 1 - 2 x 3) / (4 x 97 + 3 x 96) = 176/676 = .26

For Table 2:

2(52 x 43 - 3 x 2) / (46 x 55 + 45 x 54) = 4460/4960 = .90
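A quick sketch of this shortcut formula; as a check, it reproduces the same kappa values for both tables as the observed-vs-expected calculation.

```python
# Shortcut kappa for a 2x2 table: 2(AD - BC) / (N1*N4 + N2*N3),
# where N1, N2 are the row totals and N3, N4 the column totals.
def kappa_shortcut(a, b, c, d):
    n1, n2 = a + b, c + d  # row (MD#2) marginal totals
    n3, n4 = a + c, b + d  # column (MD#1) marginal totals
    return 2 * (a * d - b * c) / (n1 * n4 + n2 * n3)

print(round(kappa_shortcut(1, 3, 2, 94), 2))   # 0.26  (Table 1)
print(round(kappa_shortcut(43, 3, 2, 52), 2))  # 0.9   (Table 2)
```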

Note parallels between:

- THE ODDS RATIO
- THE CHI-SQUARE STATISTIC
- THE KAPPA STATISTIC

Note that the cross-products of the four-fold table, and their relation to marginal totals, are central to all three expressions.
