Joel Mao
22 December 2022
Abstract
manner. Machine learning models in particular often produce predictions and true
values that are difficult to display appropriately with conventional data visualization methods,
so we examine the steps to create a confusion matrix, as well as the challenges involved.
Table of Contents
I. Introduction
IV. Conclusion
References
Mao 2
I. Introduction
Data visualization has long been used by a wide variety of people, ranging from
businesspeople and salespeople to scientists and researchers. As such, it plays a vital role in
communicating data points and in summarizing the conclusions and results that can be
drawn from such data. This investigation examines one data visualization technique in
particular, the confusion matrix, in depth. Confusion matrices deal with the
predicted and true values of any predictive model, most commonly machine learning models.
These models make predictions, which are then checked against confirmed data sets. Confusion
matrices take these predictions and place them into a table alongside the true values, and allow
for a visualization of the correct and incorrect predictions, allowing the creator to understand
II. Literature Review
Confusion matrices are n x n matrices covering several different classes, each with
its own column and row. The x- and y-axes may be labeled according to the user's preferences, but
one axis typically holds the predicted classes, while the other holds the true classes. In order to create the
confusion matrix, there are several steps needed, with multiple classes having different scenarios
First, understanding how the data is classified lets users lay the foundation
necessary to set up the confusion matrix and its different result labels. There exist four labels:
True Positives, True Negatives, False Positives, and False Negatives. True Positives are outcomes where
the model correctly predicts the “positive” class (Google), while True Negatives are
the counterpart, outcomes where the model correctly predicts the “negative” class
(Oracle). In contrast, False Positives are outcomes where the model incorrectly predicts the “positive”
class (Google), and False Negatives are outcomes where the model incorrectly predicts the “negative” class
(Oracle).
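The four labels can be illustrated with a minimal sketch for a two-class problem. The function name `count_outcomes`, the label strings, and the sample lists are assumptions made for illustration and do not come from the data used in this paper.

```python
def count_outcomes(y_true, y_pred, positive="positive"):
    """Count the four outcome labels for a two-class problem."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            # The model predicted "positive": correct -> TP, incorrect -> FP.
            counts["TP" if truth == positive else "FP"] += 1
        else:
            # The model predicted "negative": correct -> TN, incorrect -> FN.
            counts["TN" if truth != positive else "FN"] += 1
    return counts


# Hypothetical true values and model predictions.
y_true = ["positive", "negative", "positive", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]

outcomes = count_outcomes(y_true, y_pred)
print(outcomes)
```

Each pair of true and predicted values falls into exactly one of the four labels, so the four counts always sum to the number of predictions.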
environment, such as Python or Java. Confusion matrices created in Python and
Java can commonly draw on built-in metrics (Brownlee), meaning that there are prebuilt functions
and modules written by others that can be reused for one's own purposes. This makes it much easier to
create matrices from prebuilt “templates” that simplify the process. According to Oracle, the
standard for data science and confusion matrices is typically Python, due to the flexibility
of Python in conforming data to a user's preferences. In addition, scikit-learn (sklearn) and other packages
from Python have made the coding and scripting relatively simple (Shin), with not much
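As a library-free sketch of what such prebuilt functions compute, the example below tallies a multi-class confusion matrix from paired true and predicted labels. The function name `build_confusion_matrix` and the class names are hypothetical. Rows hold the true classes and columns hold the predicted classes, which is the same orientation that scikit-learn's `confusion_matrix(y_true, y_pred)` uses.

```python
def build_confusion_matrix(y_true, y_pred, labels):
    """Tally predictions into an n x n table: rows = true, columns = predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


# Hypothetical classes and predictions.
labels = ["cat", "dog", "bird"]
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "bird", "cat"]

matrix = build_confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

Correct predictions land on the diagonal, and every off-diagonal cell records one specific kind of mistake.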
One of the most appealing features of confusion matrices is the variety of
metrics available to evaluate the performance of the model through the confusion matrix. The
most common metric is accuracy, which describes the model as a whole: the
fraction of the total predictions that the model predicted correctly (Mohajon). Another common
metric is precision, which focuses specifically on the positive class and determines the fraction
of the predicted positives that were truly positive (Mohajon). Other
important metrics are recall, or the portion of the true positive class that was correctly predicted
as positive (Vidiyala), and F1, which combines both the precision and the recall scores,
measuring the harmonic mean of the two metrics and is determined as a percentage, with 1
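The metrics above can be sketched directly from a confusion matrix. This is a minimal illustration that assumes rows are true classes and columns are predicted classes; the function names and the sample matrix are hypothetical.

```python
def accuracy(matrix):
    """Fraction of all predictions that lie on the diagonal (correct ones)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def precision_recall_f1(matrix, k):
    """Precision, recall and F1 for class index k, treated as the positive class."""
    true_positives = matrix[k][k]
    predicted_as_k = sum(row[k] for row in matrix)  # column sum
    actually_k = sum(matrix[k])                     # row sum
    precision = true_positives / predicted_as_k if predicted_as_k else 0.0
    recall = true_positives / actually_k if actually_k else 0.0
    denominator = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical 3-class matrix: rows = true classes, columns = predicted classes.
matrix = [[1, 1, 0],
          [0, 2, 0],
          [1, 0, 1]]

acc = accuracy(matrix)
precision, recall, f1 = precision_recall_f1(matrix, 1)
print(acc, precision, recall, f1)
```

In a multi-class matrix, precision and recall are computed one class at a time, treating that class as "positive" and all others as "negative".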
Confusion matrices have a wide range of applications, especially in
machine learning. One of the largest applications lies within classification problems, such as
predicting population genetic variants (Indeed). Classifying different items and
understanding how well a predictive model handles specific classes shows
which classes the model classifies well and which it classifies poorly. Other applications lie in cancer
patient modeling in healthcare, to understand whether the models are effective (Shin), as
well as in business model predictions, such as predicting whether a client will make a purchase
(Hernandez).
Variables
Procedure
Testing will be conducted by building a complete confusion matrix from sample data
produced by a machine learning model, and evaluating the steps taken along the way.
In addition, perhaps the most valuable part of the experiment will
be determining the difficulties posed by creating and using a custom confusion matrix, whether
they stem from inherent issues with the functionality of a confusion matrix or from the available resources and
the information they contain, and whether those resources prove thorough enough to guide a user through
the process without issues. The data used will be a sampled list from a JavaScript file, containing
Data Collection
IDEs Available:
● Jupyter Notebook **
● Google Colaboratory
● PyCharm
● https://www.w3schools.com/python/python_ml_confusion_matrix.asp
○ Possible issues may arise when increasing the number of classes and checking
whether the confusion matrix can expand and populate these new classes with
values.
● https://stackoverflow.com/questions/2148543/how-to-write-a-confusion-matrix
○ Uses scikit-learn, which is the most efficient and easy-to-use confusion matrix
creator
Cell Programming
properties. Pandas was necessary for formatting the data frame. Because the data was stored in a
JSON file, the json module was imported to load it into the data frame. Matplotlib was
used for the visual confusion matrix, while sklearn was imported for the various confusion
This cell displays the format of the data. This specific data file was formatted as a nested
dictionary with four levels: Classes, Predictions Class, Categories of
Predictions, and numerical Prediction Values. This process was rather simple, as it involved
This portion was the most complicated, with issues arising over how to access the
different levels of the dictionary. The intention was to access each individual prediction
result for each class. However, the dictionary format of the data made it rather difficult
to access each definition and sub-definition. For example, the class “Ground Motor Vehicle”
has a Predictions subclass with its own definition, containing 9 elements of its own
that record what the model predicted the “Ground Motor Vehicle” to be.
This was solved by organizing the different subclasses of the data and identifying which
variables could change and which had to remain static. Because the
prediction values for each class were stored in an array within the dictionary, it became clear that
a for loop could be used to collect each numerical prediction value in turn.
These values were then collected and placed into a
numerical matrix named cm, denoting its future usage within the formal
confusion matrix.
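The loop described above can be sketched as follows. The exact layout of the data file is not reproduced in this paper, so the key names "Predictions", "Categories", and "Values", the class names other than “Ground Motor Vehicle”, and all of the numbers are assumptions made for illustration. Three classes are used instead of nine to keep the example short.

```python
import json

# Hypothetical stand-in for the JSON file: class -> "Predictions" -> categories and values.
raw = """
{
  "Ground Motor Vehicle": {
    "Predictions": {
      "Categories": ["Ground Motor Vehicle", "Aircraft", "Watercraft"],
      "Values": [8, 1, 1]
    }
  },
  "Aircraft": {
    "Predictions": {
      "Categories": ["Ground Motor Vehicle", "Aircraft", "Watercraft"],
      "Values": [0, 9, 1]
    }
  },
  "Watercraft": {
    "Predictions": {
      "Categories": ["Ground Motor Vehicle", "Aircraft", "Watercraft"],
      "Values": [2, 0, 8]
    }
  }
}
"""

data = json.loads(raw)
classes = list(data)  # fixed class order, reused for both rows and columns

cm = []
for true_class in classes:
    predictions = data[true_class]["Predictions"]
    # Pair each predicted category with its numerical value.
    by_category = dict(zip(predictions["Categories"], predictions["Values"]))
    # One row per true class, with one column per predicted class.
    row = [by_category.get(predicted_class, 0) for predicted_class in classes]
    cm.append(row)

print(cm)
```

Keeping one fixed class order for both the rows and the columns is what keeps the diagonal of cm aligned with the correct predictions.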
This final cell gathered all of the information and placed it into the
actual confusion matrix. As can be seen in the image above, the confusion matrix has nine
different classes, with each class occupying a row and a column. The rows denote the true
objects, while the columns represent the predictions made by the model. This part of the coding
was relatively simple, as it used a short line of script that took in the premade confusion matrix
data array, and converted it into a more formal colored confusion matrix, revealing the
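The colored plot itself depends on Matplotlib and sklearn, so as a dependency-free stand-in the sketch below renders the same kind of array as a labeled text table. This is a swapped-in technique, not the plotting call used in the experiment, and the function name `format_matrix`, the labels, and the values are hypothetical.

```python
def format_matrix(cm, labels):
    """Render a confusion matrix as text: rows = true classes, columns = predicted."""
    width = max(len(label) for label in labels)
    width = max(width, max(len(str(value)) for row in cm for value in row))
    header = " " * width + " | " + " ".join(label.rjust(width) for label in labels)
    lines = [header, "-" * len(header)]
    for label, row in zip(labels, cm):
        cells = " ".join(str(value).rjust(width) for value in row)
        lines.append(label.rjust(width) + " | " + cells)
    return "\n".join(lines)


# Hypothetical labels and counts.
labels = ["car", "plane", "boat"]
cm = [[8, 1, 1],
      [0, 9, 1],
      [2, 0, 8]]

table = format_matrix(cm, labels)
print(table)
```

A text table like this is also a quick way to check the row and column order of cm before passing it to a plotting function.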
Rationale
The IDE chosen for creating the confusion matrix was Jupyter Notebook. This IDE
was chosen for its strengths and its widespread use within the coding community. Jupyter
Notebook is also commonly used for creating and testing machine learning models. Its
“cells” allow code to be organized and sectioned off into different areas in order to
identify which areas are functioning or malfunctioning, as well as simplifying the process of
For this experiment, it was crucial to document completely every step that was
coded. A variety of different functions needed to come together in order to create
the confusion matrix, and by researching the specifics of using a Python dictionary, as well as the
Analysis
Coding the program raised several difficulties, some of which
could be resolved with an online search, while others required some intuition to
solve. A relatively large number of resources were available for creating confusion
matrices in Python, likely due to the rising usage of machine learning and predictive models such as
the one used for the confusion matrix made in this experiment. However, one major gap in
online resources was the conversion of the dictionary stored in the JSON file. It was somewhat
challenging to convert the file into a readable format, as well as to understand exactly how to
access and retrieve information from the dictionary. This issue can be resolved with some level
of intuition, by testing the different levels with a print function, and then accessing the
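Testing the levels of a nested dictionary with print can be sketched as a small recursive helper. The function name `describe_levels` and the sample dictionary, including its key names, are hypothetical and only stand in for the real data file.

```python
def describe_levels(node, depth=0, max_items=3):
    """Return one line per key or value, indented by nesting depth."""
    indent = "  " * depth
    lines = []
    if isinstance(node, dict):
        # Show only the first few keys so large files stay readable.
        for key in list(node)[:max_items]:
            lines.append(f"{indent}level {depth} key: {key!r}")
            lines.extend(describe_levels(node[key], depth + 1, max_items))
    elif isinstance(node, list):
        preview = node[:max_items]
        lines.append(f"{indent}level {depth} list of {len(node)} items, first: {preview!r}")
    else:
        lines.append(f"{indent}level {depth} value: {node!r}")
    return lines


# Hypothetical sample with the same kind of nesting as the data file.
sample = {
    "Ground Motor Vehicle": {
        "Predictions": {
            "Categories": ["Ground Motor Vehicle", "Aircraft"],
            "Values": [8, 1],
        }
    }
}

report = describe_levels(sample)
print("\n".join(report))
```

Once the printed outline shows which key sits at which level, each value can be reached by chaining those keys in order.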
Besides this initial challenge, the programming of the confusion matrix went rather
smoothly. The most helpful resource was Stack Exchange, which is an open forum-based website
for coding and programming related questions. Other programmers on the website
had provided feedback and strategies for tackling various problems, and many of the solutions
they suggested were implemented in the programming of the confusion matrix. This
also helped with some of the more niche issues that could not be
answered by the generic online resources found. One such issue was that many of these online
resources required the individual predicted and true values from the classes in order to
format and create the confusion matrix, using the column classes of true objects and row classes
of predicted classes. Creating the confusion matrix from the provided values
and their corresponding classes and predictions therefore required a workaround: a
different approach to supplying and formatting the confusion matrix data. This was
likely because many of the guides were intended to be incorporated within the predictive
model code itself, so that the confusion matrix would be created alongside the results
of the predictive model. In this specific case, where the results had already been provided and
categorized, the task was somewhat more difficult because few guides
covered the organization of existing data rather than the calculation of true/false predictions into a
confusion matrix.
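One possible bridge between already-categorized results and guides that expect raw predicted and true values is to expand the counts back into two label lists. This is a sketch of that idea and not the workaround used in this experiment. The function name `expand_counts` and the sample data are hypothetical, and the counts are assumed to be whole numbers with rows as true classes and columns as predicted classes.

```python
def expand_counts(cm, labels):
    """Turn a table of counts into paired true/predicted label lists."""
    y_true, y_pred = [], []
    for true_label, row in zip(labels, cm):
        for predicted_label, count in zip(labels, row):
            # Repeat each (true, predicted) pair as many times as it was counted.
            y_true.extend([true_label] * count)
            y_pred.extend([predicted_label] * count)
    return y_true, y_pred


# Hypothetical two-class counts: rows = true classes, columns = predicted classes.
labels = ["car", "plane"]
cm = [[2, 1],
      [0, 3]]

y_true, y_pred = expand_counts(cm, labels)
print(y_true)
print(y_pred)
```

The resulting lists can then be passed to any routine that builds a confusion matrix from true and predicted values, and they reproduce the original counts.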
models rising, as well as the overall increase in interest of machine learning models,
understanding how to effectively create, utilize, and analyze confusion matrices has become ever
more important. By collecting the various resources available to help
new users and programmers overcome their issues with creating confusion matrices, and by
providing a basis of code that anyone can utilize and adjust for their own purposes, the
process has been simplified and compiled for easier usage in the future. Thus, with a
confusion matrix with any style and format of data is well within the bounds of achievability.
IV. Conclusion
This scientific paper discusses the necessary steps leading to the creation of a confusion
matrix, as well as the importance of each step and of the confusion matrix as a whole. To
identify an effective method of analyzing the results of a predictive model, we have looked closely
at the resources available to users creating such a confusion matrix to evaluate the
performance of their own machine learning models. The steps are quite clear and widely
available online, from the classification of labels such as the true-positive and true-negative
labels to the metrics used in actually measuring and evaluating the
performance of the models. This analysis of the steps needed to create the confusion matrix
provides centralized research on what exactly a confusion
matrix is, as well as the advantages and disadvantages of the process of creating one.
The next major research direction involves combining several different data visualization
techniques in order to cover all facets of the data itself.
References
[1] Brownlee, J. (2020, August 14). What is a confusion matrix in machine learning.
https://machinelearningmastery.com/confusion-matrix-machine-learning/
[2] Genesis. “Confusion Matrix and ROC Curve.” From The GENESIS, 26 June 2018,
https://www.fromthegenesis.com/confusion-matrix-and-roc-curve/.
[3] Google Developers. “Classification: True vs. False and Positive vs. Negative | Machine
https://developers.google.com/machine-learning/crash-course/classification/true-false-positive-negative#:~:text=A%20true%20positive%20is%20an,incorrectly%20predicts%20the%20positive%20class.
[4] Hernández, Pablo. “Mine Is Better: Metrics for Evaluating Your (and Others) Machine
evaluating-machine-learning/.
[5] Indeed Editorial Team. (2022, October 3). What is a confusion matrix? (plus how to calculate
https://www.indeed.com/career-advice/career-development/confusion-matrix
[6] Manliguez, Cinmayii. (2016). Generalized Confusion Matrix for Multiple Classes.
10.13140/RG.2.2.31150.51523.
[7] Mohajon, Joydwip. “Confusion Matrix for Your Multi-Class Machine Learning Model.”
https://towardsdatascience.com/confusion-matrix-for-your-multi-class-machine-learning-model-ff9aa3bf7826.
[8] Ng, Andrew, and Kian Katanforoosh. “Advanced Evaluation Metrics.” Section 8 (Week 8), 1
[9] Shin, Terence. “Understanding the Confusion Matrix and How to Implement It in Python.”
https://towardsdatascience.com/understanding-the-confusion-matrix-and-how-to-implement-it-in-python-319202e0fe4d.
[10] Tayabali, S. (2020, December 11). A simple guide to building a confusion matrix.
datascience/post/a-simple-guide-to-building-a-confusion-matrix
[11] Vidiyala, Ramya. “Confusion Matrix in a Nutshell.” Medium, Analytics Vidhya, 25 May
2020, https://medium.com/analytics-vidhya/a-z-of-confusion-matrix-under-5-mins-147c1b4467ab.