
220209177

Comparing Machine Learning Classification Models on Performance and Accuracy

Introduction:

The MAGIC gamma telescope dataset is Monte Carlo generated using the CORSIKA program. The
telescope captures high-energy gamma particles, allowing Cherenkov photons caused by primary
gammas to be discriminated from hadronic showers.
The gamma dataset has class labels named ‘g’ and ‘h’, denoting a gamma or hadron signal, as
well as 10 additional predictors, all related to the image-capturing configuration of the particles.
The capture process defines an ellipse; each predictor therefore describes the ellipse and is used for discrimination.

A variety of model ensembles was implemented; in addition, scikit-learn provides model-selection
algorithms that fine-tune a classifier over a combination of supplied hyperparameters.
The two search models used were GridSearchCV and HalvingGridSearchCV, the latter of which
searches over specified parameter values with successive halving. The choice of which specific
learning algorithm to implement is a critical step (Kotsiantis et al. 2006); search
algorithms therefore help determine a precise model, although they may come with limitations,
such as computational processing speed for models with many parameters (Bottou et al. 2018).
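The contrast between the two search strategies can be sketched as follows; this is a minimal illustration on a synthetic stand-in dataset (the parameter grid and estimator are illustrative choices, not the study's exact configuration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import GridSearchCV, HalvingGridSearchCV

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

# Exhaustive search: every parameter combination is fully cross-validated.
grid = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
grid.fit(X, y)

# Successive halving: all candidates start on a small resource budget and
# only the best-scoring fraction survives each round.
halving = HalvingGridSearchCV(RandomForestClassifier(random_state=0),
                              param_grid, cv=3, factor=2, random_state=0)
halving.fit(X, y)
```

Note that HalvingGridSearchCV is still marked experimental in scikit-learn, so the `enable_halving_search_cv` import must precede it.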

Scientific Method
INTRODUCING THE MODEL:
Data Pre-processing:
The “MAGIC Gamma Telescope” dataset was read into a Python kernel using Python 3.11.1
on a Mac computer. Several ensemble classifiers were implemented, namely Gradient
Boosting, Hist Gradient Boosting, Random Forest, Extra Trees, and Bagging classifiers.

Model selection:
On limited computer hardware, approximately 180 Python machine learning models, such as
random forest and gradient boosting classifiers, were implemented evenly in parallel, each
running on a maximum of 16 workers (equal to 16 threads) out of a total of 5,250 threads
available. These were all applied in a StackingClassifier to record the best predicting
estimator (highest score 0.8512).
Eventually, this total was compared with two scikit-learn model-selection modules, pitting
two competitors against each other: GridSearchCV and HalvingGridSearchCV. The metrics
compared were accuracy and ROC AUC, as well as the computation time needed to complete the
algorithm. A total of 60 models were computed, 30 per competitor.
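The stacking step can be sketched as follows; the base estimators and meta-learner here are illustrative choices rather than the exact 180-model configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Base estimators feed their predictions into a final meta-learner.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("gb", GradientBoostingClassifier(random_state=0))],
    final_estimator=LogisticRegression())
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)  # held-out accuracy of the stack
```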

Model design:
Box-and-whisker plots were produced of the accuracy results for the various data-transformation
techniques. Eventually, a MANOVA test between the competitors was implemented, tested
with accuracy only.
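The MANOVA step can be sketched in Python with statsmodels (the original analysis may have been run elsewhere; the R-style `Pr(>F)` column in Table 2 suggests R). The column names and synthetic scores below are placeholders for the real per-model results:

```python
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(0)
# 60 models: 12 per PCA method, with accuracy and fit time recorded.
df = pd.DataFrame({
    "accuracy": rng.uniform(0.55, 0.70, 60),
    "time_s": rng.uniform(75, 750, 60),
    "method": np.tile(["Incremental", "Kernel", "MiniBatchSparse",
                       "Regular", "Sparse"], 12),
})
# Multivariate test of whether the PCA method affects the outcome metrics.
res = MANOVA.from_formula("accuracy + time_s ~ method", data=df)
report = str(res.mv_test())  # Wilks, Pillai, Hotelling-Lawley, Roy
```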
Limitations:
A) The model designs only had the capacity for testing with accuracy, not ROC AUC or ROC AUC and
accuracy combined; neither did the box-and-whisker plots, so additional scores and details are
missed. These contribute to scaling the data rather than centring it.
B) Replacing the StackingClassifier with a VotingClassifier was considered, but in our study it
scored lower than the StackingClassifier. Additionally, only a few hyperparameters were available
for each model before computational limits were reached through stack overflow, so the maximum
number of hyperparameters necessary was carefully selected for every model.
Additionally, the model implementation was slow, so multiprocessing was used to parallelise
the model computation.
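In scikit-learn the parallelisation can be handled by the searches themselves via `n_jobs`, which dispatches cross-validation fits to a joblib worker pool; the cap of 16 workers below mirrors the set-up described above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# n_jobs distributes the candidate fits across up to 16 parallel workers.
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      {"n_estimators": [25, 50]}, cv=3, n_jobs=16)
search.fit(X, y)
```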

Scientific Result
INTRODUCING THE RESULT:

Table 1. Average accuracy score and computation time for the two searching classifiers, Grid
Search and Halving Grid Search, applied to each PCA-transformed dataset.
Method (PCAs)      cvSearch time (s)  cvHGrid time (s)  cvSearch avg accuracy  cvHGrid avg accuracy
Incremental        508                210               0.575                  0.577
Kernel             741                514               0.692 *                0.697 *
Mini Batch Sparse  585                314               0.686                  0.695
Regular            75                 106               0.685                  0.688
Sparse             664                416               0.688                  0.695

Kernel PCA dimensionality reduction on scaler-transformed datasets performs best for each
grid-searching algorithm (0.692, 0.697) with regard to accuracy. However, it is the most
computationally involved, taking 90% more time to compute than regular PCA for Grid Search,
with only a 0.007 difference in accuracy.
Regular PCA transformations are the quickest, with accuracy only a fraction lower than the
slower and marginally better models.
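The five decomposition variants compared in Table 1 can be sketched as follows, each reducing scaled data to two components as in the study (the kernel choice and batch settings are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import (IncrementalPCA, KernelPCA,
                                   MiniBatchSparsePCA, PCA, SparsePCA)
from sklearn.preprocessing import StandardScaler

X, _ = make_classification(n_samples=200, n_features=10, random_state=0)
X_scaled = StandardScaler().fit_transform(X)  # scale before decomposition

# The five PCA variants of Table 1, each projecting to two components.
methods = {
    "Incremental": IncrementalPCA(n_components=2),
    "Kernel": KernelPCA(n_components=2, kernel="rbf"),
    "MiniBatchSparse": MiniBatchSparsePCA(n_components=2, random_state=0),
    "Regular": PCA(n_components=2),
    "Sparse": SparsePCA(n_components=2, random_state=0),
}
reduced = {name: m.fit_transform(X_scaled) for name, m in methods.items()}
```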

Incremental PCA has a tighter boxplot, with outliers, and performs most poorly. The top
competitors are the last four PCAs. Kernel PCA performs marginally the best; however, a
minor outlier is present for the Halving classifier, and a questionable outlier is present
in PCA for cvSearch. Both plots show MiniBatchSparse decomposition competing with PCA,
while Sparse is second to Kernel decomposition.

Figure 1. Box-and-whisker plots for Grid Search CV and Halving Grid Search CV when finding the
best model hyperparameters, evaluated by transformation method and median accuracy.

Kernel PCA takes the longest to compute on the model. Both classifiers show an equal
distribution and ranking amongst the PCA transformations. PCA is the quickest method.

Figure 2. Average time to fit a model to a training dataset and predict on its test dataset;
relative performance by transformation method and classifier used.

All tests from Table 2 succeeded, suggesting that there is no clear significant difference
between the methods and their effect on the classifier.

Table 2. MANOVA testing between the classifiers used and the PCA methods, divided between
different hypothesis tests and their respective F statistics.
Test              Score    DF  Pr(>F)
Pillai            4.958    4   0.0001499
Roy               45.697   4   3.829e-11
Hotelling-Lawley  7.3165   4   5.56e-13
Wilks             11.341   4   7.555e-09

Scientific Conclusion
INTRODUCING THE DISCUSSION AND CONCLUSION:
Discussion:
Recognised research in this area, such as Dadzie and Kwakye (2021), follows a similar
methodology; however, we evaluated across scikit-learn's most prominent search-based
algorithms for fine-tuning hyperparameters. Our results are weak compared with those of
Dadzie and Kwakye, which questions the suitability of our models. However, our results show
that minimal difference exists between the PCA transformations, besides Incremental PCA.
Despite these results, regular PCA performs the quickest.
In effect, only two principal components were tested, maximised by explained variance.
Unfortunately, no best number of components was selected, so the results are based on two
components. Additional components may improve the overall result because the models are
adjusted by data size.
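The missing component-selection step could be sketched as follows: fit a full PCA and keep the smallest number of components whose cumulative explained variance passes a threshold (the 95% cut-off below is an assumed, conventional choice):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = make_classification(n_samples=300, n_features=10, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

pca = PCA().fit(X_scaled)
cumvar = np.cumsum(pca.explained_variance_ratio_)
# Smallest number of components capturing at least 95% of the variance.
n_components = int(np.searchsorted(cumvar, 0.95) + 1)
```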
A noticeable difference is recognised in the time to compute the classifier. The Halving
classifier is approximately 40% faster on average and produces marginally better results,
although no noticeable difference in accuracy is recognised. The Halving classifier is
therefore an optimal choice on small datasets.
The MANOVA test also supports the current result, in that little difference is recognised
between the models. This is a key finding, as in Dadzie and Kwakye; a variation in models
therefore likely does not signify better accuracy, although, performance-wise, some models
achieve the same result with better speed.
Limitations:
Method: The method only approached model selection and model transformations; no
investigation into classifier performance was made.
Results: The models performed far worse than our StackingClassifier set-up; therefore, the
overall results would have been different.
Conclusion:
All but Incremental PCA performed equally well on classification metrics; therefore, any of the
other four are appropriate decomposition techniques. The same can be said of the model selection,
although, for computational efficiency, Halving Grid Search is the appropriate choice. However,
a thorough analysis of classifiers and their performance is necessary for future research, as
some classifiers may respond better to the data and model selection.
References:
i. Bottou, L., et al. 2018. Optimization Methods for Large-Scale Machine Learning. SIAM
Review, 60(2), 223-311.
ii. Dadzie, E.A., and Kwakye, K.K. 2021. Developing a Machine Learning Algorithm-Based
Classification Models for the Detection of High-Energy Gamma Particles. Computer
Vision and Pattern Recognition, 1-7.
iii. Kotsiantis, S.B., et al. 2006. Machine Learning: A Review of Classification and Combining
Techniques. Artificial Intelligence Review, 26, 159-190.
