Fleiss’ Kappa
Cohen’s kappa is a measure of the agreement between two raters, where agreement due to chance is factored
out. We now extend Cohen’s kappa to the case where the number of raters can be more than two. This extension is
called Fleiss’ kappa. As with Cohen’s kappa, no weighting is used and the categories are considered to be unordered.
Let n = the number of subjects, k = the number of evaluation categories and m = the number of judges for each
subject. E.g. for Example 1 of Cohen’s Kappa, n = 50, k = 3 and m = 2. While for Cohen’s kappa both judges
evaluate every subject, in the case of Fleiss’ kappa, there may be many more than m judges and not every judge
needs to evaluate each subject; what is important is that each subject is evaluated m times.
For every subject i = 1, 2, …, n and evaluation category j = 1, 2, …, k, let xij = the number of judges that assign category j to subject i. Thus

$$\sum_{j=1}^{k} x_{ij} = m$$

The proportion of pairs of judges that agree in their evaluation of subject i is given by

$$p_i = \frac{1}{m(m-1)} \sum_{j=1}^{k} x_{ij}(x_{ij}-1)$$

Fleiss’ kappa is then defined to be

$$\kappa = \frac{\bar{p} - \bar{p}_e}{1 - \bar{p}_e}$$

where

$$\bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i \qquad\quad \bar{p}_e = \sum_{j=1}^{k} q_j^2 \qquad\quad q_j = \frac{1}{nm}\sum_{i=1}^{n} x_{ij}$$

Kappa for the jth category alone is given by

$$\kappa_j = 1 - \frac{\sum_{i=1}^{n} x_{ij}(m - x_{ij})}{nm(m-1)\,q_j(1-q_j)}$$
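The definitions above translate directly into a short function. The following is a minimal Python sketch (the function name `fleiss_kappa` and the plain list-of-lists input format are my own choices, not part of the Real Statistics add-in):

```python
def fleiss_kappa(x):
    """Fleiss' kappa for an n x k matrix of counts, where x[i][j] is the
    number of judges assigning category j to subject i (rows sum to m)."""
    n, k = len(x), len(x[0])
    m = sum(x[0])                       # judges per subject
    # proportion of agreeing judge pairs for each subject i
    p = [sum(c * (c - 1) for c in row) / (m * (m - 1)) for row in x]
    p_bar = sum(p) / n                  # mean observed agreement
    # q[j] = proportion of all assignments that went to category j
    q = [sum(row[j] for row in x) / (n * m) for j in range(k)]
    p_e = sum(qj ** 2 for qj in q)      # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)
```

As a sanity check, two raters who always agree (e.g. `[[2, 0], [0, 2]]`) give κ = 1, while complete disagreement on every subject (`[[1, 1], [1, 1]]`) gives κ = −1.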
http://www.real-statistics.com/reliability/fleiss-kappa/ 1/29
25/09/2018 Fleiss' Kappa | Real Statistics Using Excel
There is an alternative calculation of the standard error provided in Fleiss’ original paper, namely the square root of the following:

$$s.e.^2 = \frac{2}{nm(m-1)} \cdot \frac{\left(\sum_{j=1}^{k} b_j\right)^{\!2} - \sum_{j=1}^{k} b_j(1-2q_j)}{\left(\sum_{j=1}^{k} b_j\right)^{\!2}}$$

where bj = qj(1 – qj).
The test statistics zj = κj/s.e.(κj) and z = κ/s.e.(κ) are generally approximated by a standard normal distribution, which allows us to calculate a p-value and confidence interval. E.g. the 1 – α confidence interval for kappa is therefore approximated as

$$\kappa \pm z_{crit} \cdot s.e.(\kappa)$$

where zcrit is the two-tailed critical value of the standard normal distribution for the given α.
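The z statistic, two-tailed p-value and confidence interval can be computed with the standard library alone; this is a sketch (the function name `kappa_test` is my own, not part of any package):

```python
from statistics import NormalDist

def kappa_test(kappa, se, alpha=0.05):
    """z statistic, two-tailed p-value and 1 - alpha confidence interval
    for a kappa estimate with standard error se."""
    z = kappa / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))      # two-tailed p-value
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. 1.96 for alpha=.05
    return z, p, (kappa - zcrit * se, kappa + zcrit * se)
```

For instance, `kappa_test(1.0, 0.5)` gives z = 2, p ≈ .0455 and the interval (.020, 1.980).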
Example 1: Six psychologists (judges) evaluate 12 patients as to whether they are psychotic, borderline, bipolar
or none of these. The ratings are summarized in range A3:E15 of Figure 1. Determine the overall agreement
between the psychologists, subtracting out agreement due to chance, using Fleiss’ kappa. Also find Fleiss’ kappa
for each disorder.
For example, we see that 4 of the psychologists rated subject 1 as psychotic and 2 rated subject 1 as borderline, while no psychologist rated subject 1 as bipolar or none of these.
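These counts give the agreement proportion for subject 1 directly (with m = 6 judges):

$$p_1 = \frac{4(4-1) + 2(2-1) + 0 + 0}{6(6-1)} = \frac{14}{30} \approx .467$$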
We use the formulas described above to calculate Fleiss’ kappa in the worksheet shown in Figure 1. The formulas
in the ranges H4:H15 and B17:B22 are displayed in text format in column J, except that the formulas in cells H9
and B19 are not displayed in the figure since they are rather long. These formulas are:
H9 s.e. =B20*SQRT(SUM(B18:E18)^2-SUMPRODUCT(B18:E18,1-2*B17:E17))/SUM(B18:E18)
B19 κ1 =1-SUMPRODUCT(B4:B15,$H$4-B4:B15)/($H$4*$H$5*($H$4-1)*B17*(1-B17))
Note too that row 18 (labelled b) contains the formulas for qj(1–qj).
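The long cell formula in B19 can be mirrored in Python; this sketch (the function name and the 0-based category index are my own conventions) computes the kappa for a single category:

```python
def fleiss_kappa_j(x, j):
    """Fleiss' kappa for category j alone (j is 0-based), following
    kappa_j = 1 - sum_i x_ij(m - x_ij) / (n m (m-1) q_j (1 - q_j))."""
    n = len(x)
    m = sum(x[0])                                  # judges per subject
    qj = sum(row[j] for row in x) / (n * m)        # share of category j
    disagreements = sum(row[j] * (m - row[j]) for row in x)
    return 1 - disagreements / (n * m * (m - 1) * qj * (1 - qj))
```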
The p-values (and confidence intervals) show us that all of the kappa values are significantly different from zero.
Real Statistics Function: The Real Statistics Resource Pack contains the following supplemental function:
KAPPA(R1, j, lab, alpha, tails, orig): if lab = FALSE (default) returns a 6 × 1 range consisting of κ if j = 0
(default) or κj if j > 0 for the data in R1 (where R1 is formatted as in range B4:E15 of Figure 1), plus the
standard error, z-stat, z-crit, p-value and lower and upper bound of the 1 – alpha confidence interval, where
alpha = α (default .05) and tails = 1 or 2 (default). If lab = TRUE then an extra column of labels is included in
the output. If orig = TRUE then the original calculation for the standard error is used; default is FALSE.
For Example 1, KAPPA(B4:E15) = .2968 and KAPPA(B4:E15,2) = .28. The complete output
for KAPPA(B4:E15,,TRUE) is shown in Figure 3.
Real Statistics Data Analysis Tool: The Reliability data analysis tool supplied in the Real Statistics Resource
Pack can also be used to calculate Fleiss’ kappa.
To calculate Fleiss’ kappa for Example 1 press Ctrl-m and choose the Reliability option from the menu that
appears. Fill in the dialog box that appears (see Figure 7 of Cronbach’s Alpha) by inserting B4:E15 in the Input
Range, choosing the Fleiss’ kappa option and clicking on the OK button.