CONFUSION MATRIX
In the previous chapters of our Machine Learning tutorial (Neural Networks with Python and Numpy and Neural Networks from Scratch) we implemented various algorithms, but we didn't properly measure the quality of the output. The main reason was that we used very simple and small datasets to learn and test. In the chapter Neural Network: Testing with MNIST, we will work with large datasets and ten classes, so we need proper evaluation tools. In this chapter we will introduce the concept of the confusion matrix:
A confusion matrix is a matrix (table) that can be used to measure the performance of a machine learning algorithm, usually a supervised learning one. Each row of the confusion matrix represents the instances of an actual class and each column represents the instances of a predicted class. This is the way we keep it in this chapter of our tutorial, but it can be the other way around as well, i.e. rows for predicted classes and columns for actual classes. The name confusion matrix reflects the fact that it makes it easy for us to see what kinds of confusion occur in our classification algorithms. For example, the algorithm should have predicted a sample as c_i because the actual class is c_i, but it came out with c_j. In this case of mislabelling, the element cm[i, j] will be incremented by one when the confusion matrix is constructed.
We will define methods to calculate the confusion matrix, precision and recall in the following
class.
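This construction rule, i.e. incrementing cm[actual, predicted] once per sample, can be sketched in a few lines of Numpy. The label lists below are made-up illustrations, not data from this tutorial:

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Rows represent actual classes, columns represent predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1  # one more sample of actual class a that was predicted as p
    return cm

actual    = [0, 0, 1, 1, 1, 2]  # hypothetical true labels
predicted = [0, 1, 1, 1, 2, 2]  # hypothetical classifier output
print(confusion_matrix(actual, predicted, 3))
```

Every correctly classified sample lands on the diagonal; the off-diagonal entries count the confusions.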
2-CLASS CASE
In a 2-class case, i.e. "negative" and "positive", the confusion matrix may look like the following table (rows: actual classes, columns: predicted classes):
https://www.python-course.eu/confusion_matrix.php 1/6
4/19/2019 Machine Learning with Python: Confusion Matrix in Machine Learning with Python
                 predicted
                 negative   positive
actual negative     11          0
actual positive      1         12

The same matrix with the standard names of the four entries:

                 predicted
                 negative              positive
actual negative  TN (true negative)    FP (false positive)
actual positive  FN (false negative)   TP (true positive)
We can define now some important performance measures used in machine learning:
Accuracy:
AC = (TN + TP) / (TN + FP + FN + TP)
The accuracy is not always an adequate performance measure. Let us assume we have 1000 samples, 995 of which are negative and 5 positive. Let us further assume we have a classifier which classifies whatever it is presented with as negative. The accuracy will be a surprising 99.5%, even though the classifier could not recognize any positive samples.
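We can verify this arithmetic with a few lines of Python (a tiny illustration; the numbers are the ones from the example above):

```python
# An all-negative classifier on 995 negative and 5 positive samples:
TN, FP = 995, 0  # every negative sample is (trivially) classified as negative
FN, TP = 5, 0    # every positive sample is misclassified as negative

accuracy = (TN + TP) / (TN + FP + FN + TP)
print(accuracy)  # 0.995
```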
Recall:

recall = TP / (FN + TP)
True negative rate (TNR):

TNR = TN / (TN + FP)
Precision:
precision = TP / (FP + TP)
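Plugging in the numbers from the 2-class table above (TN = 11, FP = 0, FN = 1, TP = 12) gives a quick feel for these measures; a small sketch:

```python
TN, FP, FN, TP = 11, 0, 1, 12  # values from the 2-class table above

accuracy  = (TN + TP) / (TN + FP + FN + TP)
recall    = TP / (FN + TP)  # true positive rate
TNR       = TN / (TN + FP)  # true negative rate
precision = TP / (FP + TP)

print(f"accuracy={accuracy:.4f} recall={recall:.4f} TNR={TNR:.4f} precision={precision:.4f}")
```

Note that precision and TNR are both perfect here because the classifier produced no false positives, while the single false negative shows up only in recall and accuracy.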
MULTI-CLASS CASE
To measure the results of machine learning algorithms, the previous 2x2 confusion matrix will not always be sufficient. We will need a generalization for the multi-class case.
Let us assume that we have a sample of 25 animals, e.g. 7 cats, 8 dogs, and 10 snakes, most
probably Python snakes. The confusion matrix of our recognition algorithm may look like the
following table:
               predicted
               dog   cat   snake
actual dog       6     2      0
actual cat       1     6      0
actual snake     1     1      8
In this confusion matrix, the system correctly predicted six of the eight actual dogs, but in two cases it took a dog for a cat. The seven actual cats were correctly recognized in six cases, but in one case a cat was taken to be a dog. Usually, it is hard to take a snake for a dog or a cat, but this is what happened to our classifier in two cases. Yet, eight out of ten snakes were correctly recognized. (Most probably this machine learning algorithm was not written in a Python program, because Python should properly recognize its own species :-) )
You can see that all correct predictions are located in the diagonal of the table, so prediction
errors can be easily found in the table, as they will be represented by values outside the
diagonal.
We can generalize this to the multi-class case. To do this we sum over the rows and columns of the confusion matrix. Given that the matrix is oriented as above, i.e. that a given row of the matrix corresponds to a specific value of the "truth", we have:
Precision_i = M_ii / Σ_j M_ji

Recall_i = M_ii / Σ_j M_ij
This means that precision is the fraction of cases where the algorithm correctly predicted class i out of all instances where the algorithm predicted i (correctly and incorrectly). Recall, on the other hand, is the fraction of cases where the algorithm correctly predicted i out of all of the cases which are labelled as i.
precision_snake = 8 / (0 + 0 + 8) = 1
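The same calculation can be done for every class with Numpy; a small sketch using the animal matrix above (class order dog, cat, snake; rows = actual, columns = predicted):

```python
import numpy as np

cm = np.array([[6, 2, 0],   # actual dog
               [1, 6, 0],   # actual cat
               [1, 1, 8]])  # actual snake

labels = ["dog", "cat", "snake"]
for i, label in enumerate(labels):
    prec = cm[i, i] / cm[:, i].sum()  # column sum: all samples predicted as class i
    rec  = cm[i, i] / cm[i, :].sum()  # row sum: all samples actually in class i
    print(f"{label}: precision = {prec:.3f}, recall = {rec:.3f}")
```

Although the snake class has perfect precision, its recall is only 0.8, because two of the ten snakes were misclassified.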
EXAMPLE
We are now ready to code this in Python. The following code shows a confusion matrix for a multi-class machine learning problem with ten labels, for example an algorithm for recognizing the ten digits in handwritten characters.
If you are not familiar with Numpy and Numpy arrays, we recommend our tutorial on Numpy.
import numpy as np

cm = np.array(
    [[5825,    1,   49,   23,    7,   46,   30,   12,   21,   26],
     [   1, 6654,   48,   25,   10,   32,   19,   62,  111,   10],
     [   2,   20, 5561,   69,   13,   10,    2,   45,   18,    2],
The functions 'precision' and 'recall' calculate values for a single label, whereas the function 'precision_macro_average' calculates the precision for the whole classification problem.
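A minimal sketch of what such functions could look like, assuming the row = actual, column = predicted orientation used in this chapter (the function names come from the text above; the bodies are our reconstruction):

```python
import numpy as np

def precision(label, confusion_matrix):
    # everything the classifier predicted as `label` is in column `label`
    col = confusion_matrix[:, label]
    return confusion_matrix[label, label] / col.sum()

def recall(label, confusion_matrix):
    # everything that actually belongs to `label` is in row `label`
    row = confusion_matrix[label, :]
    return confusion_matrix[label, label] / row.sum()

def precision_macro_average(confusion_matrix):
    # unweighted mean of the per-label precisions
    rows, columns = confusion_matrix.shape
    return sum(precision(label, confusion_matrix)
               for label in range(columns)) / columns

# the animal matrix from above as a quick check
cm = np.array([[6, 2, 0],
               [1, 6, 0],
               [1, 1, 8]])
print(precision(2, cm), recall(2, cm))  # 1.0 0.8
```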
def accuracy(confusion_matrix):
    diagonal_sum = confusion_matrix.trace()
    sum_of_all_elements = confusion_matrix.sum()
    return diagonal_sum / sum_of_all_elements

accuracy(cm)
After executing the Python code above, we receive the following result:
0.95038333333333336
© 2011 - 2018, Bernd Klein, Bodenseo; Design by Denise Mitchinson adapted for python-course.eu by
Bernd Klein