
Fleiss’ Kappa
Cohen’s kappa is a measure of the agreement between two raters, where agreement due to chance is factored
out. We now extend Cohen’s kappa to the case where the number of raters can be more than two. This extension is
called Fleiss’ kappa. As with Cohen’s kappa, no weighting is used and the categories are considered to be unordered.

Let n = the number of subjects, k = the number of evaluation categories and m = the number of judges for each
subject. E.g. for Example 1 of Cohen’s Kappa, n = 50, k = 3 and m = 2. While for Cohen’s kappa both judges
evaluate every subject, in the case of Fleiss’ kappa, there may be many more than m judges and not every judge
needs to evaluate each subject; what is important is that each subject is evaluated m times.

For every subject i = 1, 2, …, n and each evaluation category j = 1, 2, …, k, let x_ij = the number of judges that assign category j to subject i. Thus

$$\sum_{j=1}^{k} x_{ij} = m$$
The proportion of pairs of judges that agree in their evaluation of subject i is given by

$$p_i = \frac{1}{m(m-1)} \sum_{j=1}^{k} x_{ij}(x_{ij}-1)$$
The mean of the p_i is therefore

$$\bar{p} = \frac{1}{n} \sum_{i=1}^{n} p_i$$
We use the following measure for the error term

$$\bar{p}_e = \sum_{j=1}^{k} q_j^2$$

where

$$q_j = \frac{1}{nm} \sum_{i=1}^{n} x_{ij}$$
Definition 1: Fleiss’ Kappa is defined to be

$$\kappa = \frac{\bar{p} - \bar{p}_e}{1 - \bar{p}_e}$$
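To make Definition 1 concrete, here is a minimal Python sketch of the calculation (the function name and the use of numpy are our own; the Real Statistics add-in itself works directly in Excel). It assumes a counts matrix x whose rows each sum to m, as in the definitions above.

```python
import numpy as np

def fleiss_kappa(x):
    """Fleiss' kappa for an n x k counts matrix where x[i, j] is the
    number of judges assigning category j to subject i (rows sum to m)."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    m = x[0].sum()                                   # judges per subject
    p_i = (x * (x - 1)).sum(axis=1) / (m * (m - 1))  # agreement on subject i
    p_bar = p_i.mean()                               # mean observed agreement
    q = x.sum(axis=0) / (n * m)                      # category proportions q_j
    p_e = (q ** 2).sum()                             # chance agreement
    return (p_bar - p_e) / (1 - p_e)
```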
We can also define kappa for the jth category by

$$\kappa_j = 1 - \frac{\sum_{i=1}^{n} x_{ij}(m - x_{ij})}{nm(m-1)\,q_j(1-q_j)}$$
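A similar sketch for the per-category kappa, assuming the same counts-matrix layout (this mirrors the long worksheet formula in cell B19 shown in Figure 2 below; again, the function name is ours):

```python
import numpy as np

def fleiss_kappa_j(x, j):
    """Kappa for category j (a 0-indexed column of the counts matrix)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    m = x[0].sum()
    q_j = x[:, j].sum() / (n * m)            # proportion of ratings in category j
    num = (x[:, j] * (m - x[:, j])).sum()    # sum of x_ij(m - x_ij)
    return 1 - num / (n * m * (m - 1) * q_j * (1 - q_j))
```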
The standard error for κ_j is given by the formula

$$s.e.(\kappa_j) = \sqrt{\frac{2}{nm(m-1)}}$$
The standard error for κ is given by the formula

$$s.e.(\kappa) = \frac{\sqrt{2}}{\sum_{j=1}^{k} q_j(1-q_j)\sqrt{nm(m-1)}} \sqrt{\left(\sum_{j=1}^{k} q_j(1-q_j)\right)^2 - \sum_{j=1}^{k} q_j(1-q_j)(1-2q_j)}$$
There is an alternative calculation of the standard error provided in Fleiss’ original paper, namely the square root of the following:

$$\frac{2}{nm(m-1)} \cdot \frac{\bar{p}_e - (2m-3)\bar{p}_e^{\,2} + 2(m-2)\sum_{j=1}^{k} q_j^3}{(1 - \bar{p}_e)^2}$$
The test statistics z_j = κ_j/s.e.(κ_j) and z = κ/s.e.(κ) are approximately standard normally distributed, which allows us to calculate a p-value and confidence interval. E.g. the 1 – α confidence interval for kappa is therefore approximated as

κ ± NORMSINV(1 – α/2) * s.e.
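Putting the last few formulas together, here is a hedged sketch of the z test and confidence interval in Python, assuming numpy and scipy are available (scipy's norm.ppf plays the role of NORMSINV; the function name is ours, not part of the Real Statistics add-in):

```python
import numpy as np
from scipy.stats import norm

def fleiss_kappa_test(x, alpha=0.05):
    """Overall kappa, standard error, z statistic, two-tailed p-value and
    1 - alpha confidence interval for an n x k counts matrix x."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    m = x[0].sum()
    p_i = (x * (x - 1)).sum(axis=1) / (m * (m - 1))
    q = x.sum(axis=0) / (n * m)
    p_e = (q ** 2).sum()
    kappa = (p_i.mean() - p_e) / (1 - p_e)
    b = q * (1 - q)                          # the q_j(1 - q_j) terms (row 18 of Figure 1)
    se = (np.sqrt(2 / (n * m * (m - 1)))
          * np.sqrt(b.sum() ** 2 - (b * (1 - 2 * q)).sum()) / b.sum())
    z = kappa / se
    p_value = 2 * norm.sf(abs(z))            # two-tailed p-value
    half = norm.ppf(1 - alpha / 2) * se      # NORMSINV(1 - alpha/2) * s.e.
    return kappa, se, z, p_value, (kappa - half, kappa + half)
```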

Example 1: Six psychologists (judges) evaluate 12 patients as to whether they are psychotic, borderline, bipolar
or none of these. The ratings are summarized in range A3:E15 of Figure 1. Determine the overall agreement
between the psychologists, subtracting out agreement due to chance, using Fleiss’ kappa. Also find Fleiss’ kappa
for each disorder.

Figure 1 – Calculation of Fleiss’ Kappa

For example, we see that 4 of the psychologists rated subject 1 as psychotic and 2 rated subject 1 as borderline, while no psychologist rated subject 1 as bipolar or none of these.
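Plugging these counts into the formula for p_i above gives the agreement proportion for subject 1:

$$p_1 = \frac{4 \cdot 3 + 2 \cdot 1 + 0 + 0}{6 \cdot 5} = \frac{14}{30} \approx .467$$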

We use the formulas described above to calculate Fleiss’ kappa in the worksheet shown in Figure 1. The formulas
in the ranges H4:H15 and B17:B22 are displayed in text format in column J, except that the formulas in cells H9
and B19 are not displayed in the figure since they are rather long. These formulas are:

Cell  Entity  Formula
H9    s.e.    =B20*SQRT(SUM(B18:E18)^2-SUMPRODUCT(B18:E18,1-2*B17:E17))/SUM(B18:E18)
B19   κ_1     =1-SUMPRODUCT(B4:B15,$H$4-B4:B15)/($H$4*$H$5*($H$4-1)*B17*(1-B17))

Figure 2 – Long formulas in worksheet of Figure 1

Note too that row 18 (labelled b) contains the formulas for q_j(1 – q_j).

The p-values (and confidence intervals) show us that all of the kappa values are significantly different from zero.

Real Statistics Function: The Real Statistics Resource Pack contains the following supplemental function:

KAPPA(R1, j, lab, alpha, tails, orig): if lab = FALSE (default) returns a 6 × 1 range consisting of κ if j = 0
(default) or κ_j if j > 0 for the data in R1 (where R1 is formatted as in range B4:E15 of Figure 1), plus the
standard error, z-stat, z-crit, p-value and lower and upper bound of the 1 – alpha confidence interval, where
alpha = α (default .05) and tails = 1 or 2 (default). If lab = TRUE then an extra column of labels is included in
the output. If orig = TRUE then the original calculation for the standard error is used; default is FALSE.

For Example 1, KAPPA(B4:E15) = .2968 and KAPPA(B4:E15,2) = .28. The complete output
for KAPPA(B4:E15,,TRUE) is shown in Figure 3.

Figure 3 – Output from KAPPA function

Real Statistics Data Analysis Tool: The Reliability data analysis tool supplied in the Real Statistics Resource
Pack can also be used to calculate Fleiss’ kappa.

To calculate Fleiss’ kappa for Example 1 press Ctrl-m and choose the Reliability option from the menu that
appears. Fill in the dialog box that appears (see Figure 7 of Cronbach’s Alpha) by inserting B4:E15 in the Input
Range, choosing the Fleiss’ kappa option and clicking on the OK button.

The output is shown in Figure 4.

Figure 4 – Output from Fleiss’ Kappa analysis tool
