
The predictions made by the model are with respect to the classes of the outcome variable (also referred to as the dependent variable) of the problem under consideration. For example, if the outcome variable of the problem has two classes, the problem is referred to as a binary problem. Similarly, if the outcome variable has three classes, the problem is known as a three-class problem, and so on.
Consider the confusion matrix given in Table 7.4 for a two-class problem, where the out-
come variable consists of positive and negative values.
The following measures are used in the confusion matrix:

• True positive (TP): Refers to the number of positive instances correctly predicted as positive
• False negative (FN): Refers to the number of positive instances incorrectly predicted as negative
• False positive (FP): Refers to the number of negative instances incorrectly predicted as positive
• True negative (TN): Refers to the number of negative instances correctly predicted as negative
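These four counts can be computed directly from paired lists of actual and predicted labels. The following minimal Python sketch is illustrative (the function name and the 1/0 label encoding are assumptions, not from the text):

```python
def binary_confusion_counts(actual, predicted):
    """Count TP, FN, FP, and TN for a two-class problem.

    Assumes labels are encoded as 1 (positive) and 0 (negative).
    """
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fn, fp, tn

# Example: 4 actual positives and 4 actual negatives
actual    = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 1, 0, 0]
print(binary_confusion_counts(actual, predicted))  # (3, 1, 1, 3)
```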

Now, consider a three-class problem where the outcome variable consists of three classes,
C1, C2, and C3, as shown in Table 7.5.
From the above confusion matrix, we will get the values of TP, FN, FP, and TN corresponding to each of the three classes, C1, C2, and C3, as shown in Tables 7.6 through 7.8.
Table 7.6 depicts the confusion matrix corresponding to class C1. This table is derived from Table 7.5, which shows the confusion matrix for all three classes C1, C2, and C3. In Table 7.6, the number of TP instances is “a,” where “a” counts the class C1 instances that are correctly classified as belonging to class C1. The “b” and “c” are the class C1 instances that are incorrectly labeled as belonging to class C2 and class C3, respectively; therefore, these instances come under the category of FN. On the other hand, “d” and “g” are the instances belonging to class C2 and class C3, respectively, that have been incorrectly marked as belonging to class C1 by the prediction model; hence, they are FP instances. The remaining instances, “e,” “f,” “h,” and “i,” are all correctly classified as non-C1 instances; therefore, they are referred to as TN instances. Similarly, Tables 7.7 and 7.8 depict the confusion matrices for classes C2 and C3.
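This derivation can be automated for any number of classes by collapsing the full confusion matrix into one two-class matrix per class. The following NumPy sketch (the function name per_class_counts is illustrative, not from the text) assumes rows hold the actual classes and columns the predicted classes, as in Table 7.5:

```python
import numpy as np

def per_class_counts(cm, k):
    """Collapse an n x n confusion matrix (rows = actual classes,
    columns = predicted classes) into TP, FN, FP, TN for class k."""
    tp = cm[k, k]                 # diagonal cell, e.g., "a" for C1
    fn = cm[k, :].sum() - tp      # rest of row k, e.g., b + c
    fp = cm[:, k].sum() - tp      # rest of column k, e.g., d + g
    tn = cm.sum() - tp - fn - fp  # everything else, e.g., e + f + h + i
    return tp, fn, fp, tn
```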

TABLE 7.4
Confusion Matrix for Two-Class Outcome Variables

                      Predicted
                      Positive   Negative
Actual   Positive     TP         FN
         Negative     FP         TN

TABLE 7.5
Confusion Matrix for Three-Class Outcome Variables

                 Predicted
                 C1   C2   C3
Actual   C1      a    b    c
         C2      d    e    f
         C3      g    h    i

TABLE 7.6
Confusion Matrix for Class “C1”

                    Predicted
                    C1           Not C1
Actual   C1         TP = a       FN = b + c
         Not C1     FP = d + g   TN = e + f + h + i

TABLE 7.7
Confusion Matrix for Class “C2”

                    Predicted
                    C2           Not C2
Actual   C2         TP = e       FN = d + f
         Not C2     FP = b + h   TN = a + c + g + i

TABLE 7.8
Confusion Matrix for Class “C3”

                    Predicted
                    C3           Not C3
Actual   C3         TP = i       FN = g + h
         Not C3     FP = c + f   TN = a + b + d + e


7.5.2 Sensitivity and Specificity


Sensitivity is defined as the ratio of correctly classified positive instances to the total num-
ber of actual positive instances. It is also referred to as recall or true positive rate (TPR).
If we get a sensitivity value of 1.0 for a particular class C, then this means that all the
instances that belong to class C are correctly classified as belonging to class C. Sensitivity
is given by the following formula:

\[
\text{Sensitivity or recall (Rec)} = \frac{TP}{TP + FN} \times 100
\]
However, an important point to note is that this value says nothing about the other instances, those that do not belong to class C but are still incorrectly classified as belonging to class C.
Specificity is defined as the ratio of correctly classified negative instances to the total
number of actual negative instances. It is given by the following formula:

\[
\text{Specificity} = \frac{TN}{FP + TN} \times 100
\]
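As a small sketch (not code from the text), both measures follow directly from the four counts; the example numbers are the TP, FN, FP, and TN values used later in Table 7.10:

```python
def sensitivity(tp, fn):
    """Recall / true positive rate as a percentage: TP / (TP + FN) * 100."""
    return tp / (tp + fn) * 100

def specificity(tn, fp):
    """True negative rate as a percentage: TN / (FP + TN) * 100."""
    return tn / (fp + tn) * 100

print(sensitivity(516, 25))  # ~95.38: nearly all positive instances found
print(specificity(725, 10))  # ~98.64: very few negatives misclassified
```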

TABLE 7.10
Performance Measures for Confusion Matrix Given in Table 7.7

Performance Measure           Formula                                             Values Obtained                                    Result
Sensitivity or recall (Rec)   TP/(TP + FN) × 100                                  516/(516 + 25) × 100                               95.37
Specificity                   TN/(FP + TN) × 100                                  725/(10 + 725) × 100                               98.63
Accuracy                      (TP + TN)/(TP + FN + FP + TN) × 100                 (516 + 725)/(516 + 25 + 10 + 725) × 100            97.25
Precision (Pre)               TP/(TP + FP)                                        516/(516 + 10)                                     0.981
F-measure                     (2 × Pre × Rec)/(Pre + Rec)                         (2 × 0.981 × 0.954)/(0.981 + 0.954)                0.967
a+                            TP/(TP + FP)                                        516/(516 + 10)                                     0.981
a−                            TN/(TN + FN)                                        725/(725 + 25)                                     0.967
FPR                           FP/(FP + TN) × 100                                  10/(10 + 725) × 100                                1.36
G-measure                     (2 × Recall × (100 − FPR))/(Recall + (100 − FPR))   (2 × 95.4 × (100 − 1.36))/(95.4 + (100 − 1.36))    96.99
G-mean                        √((a+) × (a−))                                      √(0.981 × 0.967)                                   0.973
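The measures in Table 7.10 can be reproduced in a few lines of Python; this is a sketch rather than code from the text, and the commented values agree with the table up to rounding:

```python
import math

tp, fn, fp, tn = 516, 25, 10, 725              # counts behind Table 7.10

rec = tp / (tp + fn) * 100                     # sensitivity/recall   ~95.38
spec = tn / (fp + tn) * 100                    # specificity          ~98.64
acc = (tp + tn) / (tp + fn + fp + tn) * 100    # accuracy             ~97.26
pre = tp / (tp + fp)                           # precision            ~0.981
f_measure = 2 * pre * (rec / 100) / (pre + rec / 100)  # F-measure    ~0.967
a_plus = tp / (tp + fp)                        # a+                   ~0.981
a_minus = tn / (tn + fn)                       # a-                   ~0.967
fpr = fp / (fp + tn) * 100                     # false positive rate  ~1.36
g_measure = (2 * rec * (100 - fpr)) / (rec + (100 - fpr))          #  ~96.98
g_mean = math.sqrt(a_plus * a_minus)           # G-mean               ~0.974
```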

TABLE 7.11
Confusion Matrix for Three-Class Outcome Variable

                        Predicted
                        High (1)   Medium (2)   Low (3)
Actual   High (1)       3          9            0
         Medium (2)     3          34           1
         Low (3)        1          4            5

Solution:
From the confusion matrix given in Table 7.11, the values of TP, FN, FP, and TN corresponding to each of the three classes, high (1), medium (2), and low (3), are derived and shown in Tables 7.12 through 7.14.

The values of the different performance measures at each severity level, namely, high, medium, and low, computed on the basis of Tables 7.12 through 7.14, are given in Table 7.15.
TABLE 7.12
Confusion Matrix for Class “High”

                      Predicted
                      High      Not High
Actual   High         TP = 3    FN = 9
         Not high     FP = 4    TN = 44

TABLE 7.13
Confusion Matrix for Class “Medium”

                      Predicted
                      Medium    Not Medium
Actual   Medium       TP = 34   FN = 4
         Not medium   FP = 13   TN = 9

TABLE 7.14
Confusion Matrix for Class “Low”

                      Predicted
                      Low       Not Low
Actual   Low          TP = 5    FN = 5
         Not low      FP = 1    TN = 49
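These three tables can be checked mechanically by applying the one-vs-rest collapse shown earlier to the matrix of Table 7.11 (a NumPy sketch, not code from the text):

```python
import numpy as np

cm = np.array([[3,  9, 0],   # actual high
               [3, 34, 1],   # actual medium
               [1,  4, 5]])  # actual low

for k, name in enumerate(["high", "medium", "low"]):
    tp = cm[k, k]
    fn = cm[k, :].sum() - tp      # rest of the row
    fp = cm[:, k].sum() - tp      # rest of the column
    tn = cm.sum() - tp - fn - fp  # everything else
    print(f"{name}: TP={tp}, FN={fn}, FP={fp}, TN={tn}")

# high:   TP=3,  FN=9, FP=4,  TN=44  (Table 7.12)
# medium: TP=34, FN=4, FP=13, TN=9   (Table 7.13)
# low:    TP=5,  FN=5, FP=1,  TN=49  (Table 7.14)
```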

7.5.6 Receiver Operating Characteristics Analysis


Receiver operating characteristic (ROC) analysis is one of the most popular techniques used to measure the accuracy of a prediction model. It determines how well the model has performed on the test data. It is used in situations where a decision between two possible outcomes is to be made, for example, whether a sample belongs to the change-prone or the not change-prone class.
ROC analysis is well suited for problems with a binary outcome variable, that is, when the outcome variable has two possible values. In the case of a multinomial outcome variable, that is, when the outcome variable has three or more possible groups of outcome values, separate prediction functions are generated for each of the groups, and ROC analysis is then carried out for each group individually.
Often, the values of the outcome variable are referred to as the target group and the reference group (Meyers et al. 2013). Now, the question arises as to what should be considered the target group. For example, if the outcome variable has two classes, A and B, which class should be considered the target group and which the reference group? Usually, the target group is the one that satisfies the condition that we need to identify or predict; therefore, it is referred to as the group of positive outcomes. The remaining instances correspond to the alternative group, referred to as the “reference group.” These instances are referred to as the negative outcomes, as shown in Table 7.4.

7.5.6.1 ROC Curve


One of the most important outputs of ROC analysis is the ROC curve, a visual representation used to depict the overall accuracy of the prediction model. The ROC curve is defined as a plot of sensitivity on the y-axis against 1 − specificity on the x-axis (Hanley and McNeil 1982; El Emam et al. 1999). In other words, the ROC curve is a plot of the TP rate against the FP rate at different possible threshold values (cutoff points). It is represented by a graph together with a diagonal line, as shown in Figure 7.13. This diagonal line represents a random model that has no predictive power. We can also interpret the curve as showing a tradeoff between sensitivity and specificity, in the sense that any increase in the value of sensitivity will be accompanied by a decrease in the value of specificity.