
SCHOOL OF POSTGRADUATE STUDIES
FEDERAL UNIVERSITY LOKOJA
KOGI STATE

GROUP ONE (1)

DEPARTMENT: COMPUTER SCIENCE

FACULTY: SCIENCE

COURSE CODE: CSC 832

COURSE TITLE: MACHINE LEARNING

OUTLINE
Review at least 25 of the most recent papers (2020-2022) on the state of the art in tree-based model solutions, and write a 17-page review article.

GROUP MEMBER
S/N NAME MATRIC NUMBER

1 ADEBAYO, NANA ONYINOYI PG/MS/20/CSC/001

2 ADEJO, ALOYSIUS OKPANACHI PG/MS/20/CSC/002

3 AKAGWU, JOHN OSMAN ATTADOGA PG/MS/20/CSC/003


4 ALI, MONDAY MUHAMMED PG/MS/20/CSC/004

5 ALIYU, ILIYASU EGENE PG/MS/20/CSC/005

LECTURER: DR EMEKA OGBUJU


INTRODUCTION

The Merriam-Webster dictionary defines a model as a system of postulates, data, and inferences presented as a mathematical description of an entity or state of affairs, or as an example for imitation or emulation.

This work examines tree-based models. Tree-based classification models are a type of supervised machine learning algorithm that uses a series of conditional statements to partition training data into subsets. Each successive split adds some complexity to the model, which can then be used to make predictions. The resulting model can be visualized as a roadmap of logical tests that describes the data set. Decision trees are popular for small-to-medium-sized data sets because they are easy to implement and even easier to interpret (K.C. Lee, 2020).
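To make the idea of partitioning data with conditional statements concrete, the following is a minimal sketch of a decision tree learner written in plain Python. It is not taken from any of the reviewed papers: the Gini impurity criterion, the function names and the toy data are all assumptions chosen for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature index, threshold) that most reduces impurity, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; a leaf stores the majority label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow the conditional tests from the root down to a leaf label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Toy data (made up): [hours_studied, classes_missed] -> outcome
X = [[1, 5], [2, 4], [3, 4], [6, 1], [7, 0], [8, 2]]
y = ["fail", "fail", "fail", "pass", "pass", "pass"]
tree = build_tree(X, y)
print(tree)
```

On this toy data a single test on the first feature separates the two classes, so the learned tree is one conditional statement with two leaves. Larger data sets produce deeper chains of such tests.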

Tree-based models were developed in response to particular questions and needs. How well have these models evolved, and how well have they been utilized? This work addresses these questions. A number of articles on tree-based models are reviewed below.

Privacy Preserving Vertical Federated Learning for Tree-based

Models by Yuncheng Wu, Shaofeng Cai, Xiaokui Xiao, Gang Chen,

Beng Chin Ooi, 2020.

Over time, concerns about data theft have grown because data has become easy to access, and it is in this light that the authors of this article identify the inadequacies of the popular horizontal federated learning. Existing work on federated learning has mainly focused on the horizontal setting, which assumes that each client's data have the same schema but no tuple is shared by multiple clients. In practice, however, there is often a need for vertical federated learning, where all clients hold the same set of records while each client only has a disjoint subset of features (Yuncheng Wu et al, 2020:1).
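The difference between the two settings can be sketched with a small made-up table. The code below is only an illustration of how the data is split between clients; the record values, column names and function names are assumptions, and it does not implement any privacy protection.

```python
# Made-up table: one row per customer, one column per feature.
records = {
    "c1": {"age": 34, "income": 52000, "balance": 1200, "defaulted": 0},
    "c2": {"age": 51, "income": 87000, "balance": 300, "defaulted": 1},
    "c3": {"age": 27, "income": 31000, "balance": 4500, "defaulted": 0},
    "c4": {"age": 45, "income": 64000, "balance": 800, "defaulted": 1},
}

def horizontal_partition(data, id_groups):
    """Horizontal setting: each client holds different records, all columns."""
    return [{rid: dict(data[rid]) for rid in ids} for ids in id_groups]

def vertical_partition(data, feature_groups):
    """Vertical setting: each client holds every record, but only its own columns."""
    return [
        {rid: {f: row[f] for f in feats} for rid, row in data.items()}
        for feats in feature_groups
    ]

h_clients = horizontal_partition(records, [["c1", "c2"], ["c3", "c4"]])
v_clients = vertical_partition(records, [["age", "income"], ["balance", "defaulted"]])

print(sorted(h_clients[0]))           # record ids held by the first horizontal client
print(sorted(v_clients[0]["c1"]))     # columns held by the first vertical client
```

In the vertical case only one client holds the label column, which is why leakage of training labels and feature values between clients becomes the central concern.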

This article identified two possible privacy leakages: the training label

leakage and the feature value leakage, regarding a target client’s training

dataset. The intuition behind the leakages is that the colluding clients are

able to split the sample set based on the split information in the model

and their own datasets (Yuncheng Wu et al, 2020:7).

In the article, Pivot is proposed as a privacy-protection solution suited to applications such as financial risk management, and it is applied to decision tree-based models. The authors present a system model and a threat model as part of the solution overview. In conclusion, the experimental results demonstrated that Pivot achieves accuracy comparable to non-private algorithms and is highly efficient (Yuncheng Wu et al, 2020:16).

A Tree-based Machine Learning Model for Go-around Detection and

Prediction by Imen Dhief, Sameer Alam, Chan Chea Mean and Nimrod

Lilith, 2021.

The authors of this paper identified the risks associated with the take-off and landing of aircraft and sought ways to reduce them. They discussed the important role played by air traffic controllers (ATC) in go-around operations and proposed a data-driven, machine-learned safety metric to assist tower ATC with increased situational awareness. In particular, the paper developed a machine learning prediction model for go-around events when an aircraft is in its final approach phase (Imen et al, 2021:1).

The paper reports that the algorithm developed for go-around labeling correctly identified 731 go-arounds, among which 93 flights changed their runway following a go-around. This shows that a runway change is relatively frequent when performing a go-around, as it represents around 13% of the total number of go-arounds (Imen et al, 2021:6).
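The quoted proportion can be checked directly from the two counts reported in the paper. The variable names below are ours.

```python
total_go_arounds = 731   # go-arounds identified by the labeling algorithm
runway_changes = 93      # go-arounds followed by a runway change

share = runway_changes / total_go_arounds
print(f"{share:.1%}")    # prints 12.7%, i.e. roughly 13%
```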

Two types of experiments were conducted: the first used down-sampling techniques and the second used the full data set.

Despite the good findings and proposals in this paper, it is worth noting that the authors did not address the limitations of digitalization, under which data can be corrupted by power failures or viruses.

forgeNet: a graph deep neural network model using tree-based

ensemble classifiers for feature graph construction by Yunchuan Kong

and Tianwei Yu, 2020.

This paper begins by stating the challenges involved in bioinformatics studies. One important problem is the prediction of clinical outcomes using profiling datasets with a large number of variables, such as gene expression, proteomics and metabolomics data. In such datasets, the major challenges lie in the relatively small number of samples compared to the large number of predictors (genes/proteins/metabolites), namely the 'n ≪ p' issue (Yunchuan and Tianwei, 2020:3507).

Having identified the problem, the authors developed a method that does not rely on a given feature network, yet can still benefit from the idea of building a model with a sparse and informative flow of information. Instead of using known feature graphs, they construct a feature graph within the feature space, using a supervised feature graph construction framework based on tree-based ensemble models (Yunchuan and Tianwei, 2020:3508).

The simulation study in the paper showed forgeNet to be a powerful classifier with reasonably good feature selection ability. The experimental results indicate that the novelty of forgeNets is that, by borrowing the neural net architecture of the original graph-embedded deep feedforward network (GEDFN), they utilize feature information more effectively in classification tasks than regular tree-based ensemble methods (Yunchuan and Tianwei, 2020:3511).

FIST: A Feature-Importance Sampling and Tree-Based Method for

Automatic Design Flow Parameter Tuning by Zhiyao Xie, Guan-Qi

Fang, Yu-Hung Huang, Haoxing Ren, Yanqing Zhang, Brucek Khailany,

Shao-Yun Fang, Jiang Hu, Yiran Chen, Erick Carvajal Barboza, 2020.

This paper notes that modern industrial chip design flows are immensely complex. A design flow might have multiple steps, each step might have multiple functions, and each function can be configured with many parameters. Consequently, industrial flows may have hundred-thousand lines of scripts and are configured with thousands of parameters (Zhiyao et al, 2020:1).


The authors also note that changing logic synthesis parameters can result in a 3X difference in power and more than one clock cycle difference in slack. Industrial design teams tune flow parameters as best they can, usually manually and based on designers' experience. Because industrial design flows can take several hours or days to run on large designs, the manual parameter tuning process can be very time-consuming, especially for novice designers. Consequently, design turnaround time is stretched or design quality is compromised by an inadequate exploration of parameters.

In this paper, Zhiyao et al proposed a Feature-Importance Sampling and

Tree-Based (FIST) method to conduct design flow parameter tuning.

FIST learns the impact of parameters from previously well explored

designs and fully utilizes such information in its sampling process.

In the course of this study, Zhiyao et al built a large dataset, from which they developed a clustering-based method that leverages prior data to improve sampling efficiency during exploration. They also introduced approximate sampling and dynamic modeling based on semi-supervised learning and bias-variance trade-off principles. Compared with prior exploration methods, this approach either improves design quality significantly or requires much lower sampling cost to achieve a given design performance.

Having examined the approach and experiments in this work, we rate the authors' contributions highly.

Functionalization of Remote Sensing and On-site Data for Simulating

Surface Water Dissolved Oxygen: Development of Hybrid Tree-Based

Artificial Intelligence Models by Tiyasha Tiyasha, Tran Minh Tung,

Suraj Kumar Bhagat, Mou Leong Tan, Ali H. Jawad, Wan Hanna Melini

Wan Mohtar, Zaher Mundher Yaseen, 2021.

Tiyasha et al begin this paper by looking at water quality (WQ), its maintenance and the effects of poor water quality, and then consider ways of addressing the problem. In their words, the challenges of water quality drew researchers' attention to the need to develop systems for WQ monitoring at the global and regional levels to ensure proper prediction of sudden changes, which helps in the efficient management of water resources. However, conventional techniques such as laboratory evaluation of surface WQ are time-consuming and costly. Hence, mathematical models have been developed as an alternative solution for WQ parameter estimation. However, mathematical models have several limitations, such as inadequate modeling performance, non-effective generalized methodologies, and difficulty in addressing high data uncertainty and stochasticity. As such, advanced soft computing machine learning (ML) algorithms have been developed to enhance the prediction of WQ parameters. ML algorithms are developed from statistical methods that can automatically learn from data and build a detection/classification/estimation model that reduces the variation between the training and prediction dataset without the need for explicit programming (Tiyasha et al., 2021).

These authors believe that WQ monitoring using advanced techniques

such as artificial intelligence (AI) and remote sensing can help in the

taking of appropriate measures to mitigate the harmful effects of water

pollution.

The XGBoost model was used in this work because, according to the authors, it is widely favored for the flexibility of its hyperparameter tuning by soft computing techniques.

Commendably, the authors close by acknowledging that the use of the data for trend and statistical analyses was not covered in their paper. In particular, the use of remote sensing data has the advantage of providing more knowledge about the selected site characteristics, climate change, and meteorological variation events, although these variables may not be useful during modeling. Future research can explore more remote sensing data with the aim of reducing monitoring site data utilization without compromising model efficiency and accuracy, thus reducing the cost of data collection and experiments.

Tree-based nonlinear ensemble technique to predict energy dissipation in stepped spillways by Ömer Ekmekcioğlu, Eyyup Ensar Başakın & Mehmet Özger, 2020.

The authors of this paper begin by stating that the main purpose of spillways is to ensure the transmission of flood flow without causing major damage in the upstream and downstream parts and, in doing so, to meet the hydraulic criteria while keeping the cost to a minimum. They note that stepped spillways were widely used in the past and that, although their flow regimes, turbulence and analysis are complex (Chanson & Gonzalez, 2005 in Ömer et al, 2020), they have been applied over the last few decades because they considerably reduce cost (Rajaratnam, 1990 in Ömer et al, 2020). Stepped spillways maximise energy dissipation and reduce the length of the stilling basin (Chanson, 1993 in Ömer et al, 2020).

The support vector regression method and the K-star algorithm were used; the authors note that the support vector machine is an optimisation-based ML algorithm that works on the principle of minimising structural risk (Ömer et al, 2020:4). An ensemble model was used to produce the findings of the research. The authors conclude that stepped spillways not only provide gradual energy dissipation but also reduce the size of the stilling basin. Although stepped spillways are difficult to examine in terms of hydraulics, they are commonly preferred structures due to their high energy dissipation.

An Evaluation of Preprocessing Steps and Tree-based Ensemble

Machine Learning for Analysing Sentiment on Indonesian YouTube

Comments by A. S. Aribowo, H. Basiron, N. S. Herman, S. Khomsah.

As its title suggests, this research examined comments made on YouTube and used ensemble models to determine how to handle the incorrect use of words and emoticons.

In the end, the authors found that the comments can be improved upon with proper command of words.

Novel Ensemble Tree Solution for Rockburst Prediction Using Deep

Forest by Diyuan Li, Zida Liu, Danial Jahed Armaghani, Peng Xiao and

Jian Zhou, 2022.

It is worth noting that the frequent occurrence of rockbursts motivated these authors to investigate this topic. A rockburst is the unexpected release of strain energy during or after the excavation of underground engineering works. The authors used an ensemble tree model in their study and examined the extent to which the model can be utilized in engineering works.

The article notes that small databases limit its findings. The authors conclude as follows: Deep Forest (DF), a novel tree-based ensemble model, was proposed to build the rockburst prediction model based on 329 collected real rockburst cases. Bayesian optimization was used to tune the hyperparameters of the DF. The DF had 100% accuracy in the training set and 92.4% accuracy in the testing set, and it performed better than other ML models and can forecast massive rockburst disasters (Diyuan et al, 2022).

Long-Term Wind Power Forecasting Using Tree-Based Learning

Algorithms by AMIRHOSSEIN AHMADI, MOJTABA NABIPOUR,

BEHNAM MOHAMMADI-IVATLOO, ALI MORADI AMANI,

SEUNGMIN RHO AND MD. JALIL PIRAN, 2020.

This article looks into the possibility of long-term wind power forecasting, as opposed to the conventional short-term wind power forecasting. Since wind energy is one of the green energy sources the world is turning to, this research is very important. The article notes that the forecasting accuracy of the proposed models was investigated using observations measured at various heights and time intervals.

Amirhossein et al found that the quantity and quality of the dataset have a profound impact on the performance of the wind power forecasting model. They submitted that the uncertain nature of wind makes the collection of sufficiently representative datasets difficult.

The generalized minimum spanning tree problem: An overview of formulations, solution procedures and latest advances by Petrică C. Pop

This paper opens by noting that the minimum spanning tree problem is not new: it was first discussed in 1926 by Borůvka. The author states that the minimum spanning tree (MST) problem is well known because of the efficient solutions that make it practical to solve even on large graphs.

In this article, the Generalized Minimum Spanning Tree Problem (GMSTP) is the focus of the research, and two integer programming formulations are described: the local-global formulation and the multigraph formulation, which are said to be specially tailored to the investigated problem. The article concludes that the best exact algorithms can only solve relatively small instances and that in practice heuristic algorithms are the preferred solution.

New M5P model tree-based control for doubly fed induction

generator in wind energy conversion system by Mounira Ali, Abdelaziz

Talha, El madjid Berkouk, 2020.

Today, wind energy has been made popular by widespread discussion of green energy. The article states that, among the different available wind energy conversion systems (WECS), variable-speed wind turbine generators have attracted much attention because of their high energy production efficiency and reduced friction and mechanical stress.

The authors state that the main objective of the paper was to design a new M5 control algorithm based on a fuzzy logic dataset to improve the control performance of the rotor-side converter (RSC) of the doubly fed induction generator (DFIG). The resulting algorithm is expressed as simple if/then rules. The M5 could reduce the complexity of fuzzy logic computation in the RSC, thereby enhancing the control dynamics and robustness and minimizing the harmonic distortion. The simulation results in this article show that M5P controllers have a smoother start-up and faster dynamics compared with fuzzy logic.
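A model tree of the M5 family stores a linear model at each leaf, so a trained controller reduces to a handful of if/then rules, each returning a linear function of its inputs. The sketch below shows only the shape of such rules. The input names, thresholds and coefficients are made up for illustration and are not taken from the paper.

```python
def m5_style_controller(error, d_error):
    """Piecewise-linear if/then rules of the kind a model tree produces.

    error   : tracking error (hypothetical input)
    d_error : change in error since the last step (hypothetical input)
    All thresholds and coefficients below are invented for illustration.
    """
    if error <= -0.5:
        return 0.9 * error + 0.1 * d_error - 0.05
    elif error <= 0.5:
        return 0.6 * error + 0.3 * d_error
    else:
        return 0.9 * error + 0.1 * d_error + 0.05

print(m5_style_controller(0.0, 0.0))
print(m5_style_controller(1.0, 0.0))
```

Evaluating such rules needs only a few comparisons and one multiply-add per input, which is consistent with the reduced computation the authors report relative to fuzzy inference.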

A Continuous Cuffless Blood Pressure Estimation Using Tree-Based

Pipeline Optimization Tool by Suliman Mohamed Fati, Amgad Muneer,

Nur Arifin Akbar and Shakirah Mohd Taib, 2021.

The article establishes that high blood pressure is a serious health challenge with significant health impacts, which makes it important for people to monitor their health closely. Two types of blood pressure (BP) monitoring were discussed, aneroid blood pressure measurement and the invasive arterial BP measurement procedure; the former uses a cuff while the latter is cuffless. The authors used a tree-based pipeline optimization tool (TPOT) to estimate blood pressure from the photoplethysmogram (PPG). The paper focused on the extraction of PPG signals to derive key features of invasive arterial BP.

The authors rightly point out a gap in their work, stating that further work can be conducted to explore critical care unit artifacts, such as line access interruptions in the BP waveform, to assess infection risk due to central line-associated bloodstream infection (CLABSI) in both pediatric and adult ICU settings.

TreeCaps: Tree-Based Capsule Networks for Source Code Processing

by Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang, 2021.

This article observes that programmers do not simply jump into new work; they first work to understand existing code before trying to implement new features or fix bugs. TreeCaps was proposed in the article as a novel neural network architecture that incorporates tree-based convolutional neural networks (TBCNN) into capsule networks for better learning of code on abstract syntax trees.

The paper claims to be the first to repurpose capsule networks over syntax trees to learn code without the need for explicit semantic analysis.

Robust Counterfactual Explanations for Tree-Based Ensembles by

Sanghamitra Dutta, Jason Long, Saumitra Mishra, Cecilia Tilli, Daniele

Magazzeni, 2022.

According to this article, the goal of counterfactual explanations is to

guide an applicant on how they can change the outcome of a model by

providing suggestions for improvement.

This work addresses the problem of finding robust counterfactuals for tree-based ensembles. It provides a novel metric to compute the stability of a counterfactual, which can be representative of its robustness to possible model changes, as well as a novel algorithm to find robust counterfactuals (Sanghamitra et al, 2022).

The authors also acknowledge a limitation of their evaluation: although not exactly comparable, their cost and validity are in the same ballpark as those observed for these datasets in existing works.

RDTIDS: Rules and Decision Tree-Based Intrusion Detection System

for Internet-of-Things Networks by Mohamed Amine Ferrag, Leandros

Maglaras, Ahmed Ahmim, Makhlouf Derdour and Helge Janicke, 2020.

This paper points out that the spread of cyberattacks has led to the formation of teams to counter the attackers. Countermeasures are taken according to the information obtained from the detection systems about the detected attacks.

This paper proposed a hierarchical intrusion detection system based on

the combination of three different classifiers, namely REP Tree, JRip

algorithm and Forest PA. The proposed model consists of three

classifiers, where two of them operate in parallel and feed the third one.
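
The arrangement of two parallel classifiers feeding a third can be sketched as follows. This is only a structural illustration: the three functions are hypothetical threshold rules standing in for the trained models, the feature names are invented, and we assume the third classifier receives both first-stage outputs together with the original features.

```python
def stage_a(flow):
    """Stand-in for the first parallel classifier (hypothetical rule)."""
    return "attack" if flow["packets_per_s"] > 1000 else "benign"

def stage_b(flow):
    """Stand-in for the second parallel classifier (hypothetical rule)."""
    return "attack" if flow["failed_logins"] >= 5 else "benign"

def stage_c(flow, a_out, b_out):
    """Stand-in for the final classifier, fed by both first-stage outputs."""
    if a_out == b_out:
        return a_out
    # On disagreement, fall back to one more (hypothetical) feature test.
    return "attack" if flow["distinct_ports"] > 50 else "benign"

def classify(flow):
    """Run the two first-stage classifiers, then pass their outputs on."""
    a_out, b_out = stage_a(flow), stage_b(flow)
    return stage_c(flow, a_out, b_out)

sample = {"packets_per_s": 5000, "failed_logins": 0, "distinct_ports": 120}
print(classify(sample))
```

Because the first two stages do not depend on each other, they can be evaluated independently on the same input before the final stage combines their outputs.
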
Ranking top-k trees in tree-based phylogenetic networks by Momoko

Hayamizu and Kazuhisa Makino, 2020.

This paper states that phylogenetic networks have become popular among biologists as a tool to depict conflicting signals and to model reticulate evolution. In this paper, the authors considered the top-k support tree ranking problem and provided a linear-delay (and hence theoretically optimal) algorithm for solving it.

A gradient boosted decision tree-based sentiment classification of

twitter data by S. Neelakandan and D. Paulraj, 2020.

This paper begins by stating that the Internet has become the most important source of information for people making decisions, and that Twitter is one of the most popular and most used platforms in this respect. The proposed system performs efficient sentiment analysis (SA) of Twitter data using the gradient boosted decision tree (GBDT) classifier, and the proposed technique is implemented in Java.

The authors identify a gap for future work: the approach could be extended to test a general thesaurus centered on a common corpus.

Artificial Flora Algorithm-Based Feature Selection with Gradient

Boosted Tree Model for Diabetes Classification by Nagaraj P,

Deepalakshmi P, Romany F Mansour, Ahmed Almazroa, 2021.

This paper points out that diabetes is on the increase everywhere in the world and identifies three types: type 1 diabetes mellitus, type 2 diabetes mellitus and gestational diabetes mellitus (GDM). The paper used an artificial flora algorithm (AFA) for feature selection (the AFA-FS model) and a gradient boosted tree (GBT) model for classification, together forming the AFA-GBT model. The GBT model was superior to other models because it was highly flexible, offered better classification accuracy and operated on both categorical and numerical values. The gap in the paper is that there was still no balance between the numbers of the three types of samples.

Performance Evaluation of Deep Learning-Based Gated Recurrent Units (GRUs) and Tree-Based Models for Estimating ETo by Using Limited Meteorological Variables by Mohammad Taghi Sattari, Halit Apaydin and Shahaboddin Shamshirband, 2020.

This paper discusses the use of water for irrigation and notes that there are various methods to determine plant water requirements, but the Penman-Monteith (ETo-PM) method presented by the United Nations Food and Agriculture Organization (FAO) has been accepted as the standard, since other methods give different results. This method calculates reference evapotranspiration values using different meteorological variables. Recently, artificial intelligence, specifically machine learning and data mining, has been used to calculate evapotranspiration (ETo) amounts.

The gap in the study, as acknowledged by the authors, is that the findings of this research are similar to those of other works.


A novel tree-based dynamic heterogeneous ensemble method for

credit scoring by Yufei Xia, Mengyi Niu, 2020.

This paper states that extending credit to customers is a core business of financial institutions because it brings dramatic profits to stakeholders. An ensemble model was used in this article; the rationale behind ensemble learning is to integrate the decisions of different algorithms to achieve a better result than relying on a single algorithm. The long-lasting effects of the subprime crisis highlight the importance of credit risk assessment tools in developed and developing countries. Due to its superior performance, the ensemble method has attracted much attention from researchers in the credit scoring domain.
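The rationale of integrating the decisions of different algorithms can be illustrated with the simplest combination rule, majority voting. The sketch below is not the dynamic heterogeneous method proposed in the paper; it is a static vote over three hypothetical rule-based learners whose names, features and thresholds are invented.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the base learners' predictions."""
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical base learners for a credit decision.
def by_income(applicant):
    return "good" if applicant["income"] >= 40000 else "bad"

def by_debt(applicant):
    return "good" if applicant["debt_ratio"] < 0.4 else "bad"

def by_history(applicant):
    return "good" if applicant["late_payments"] == 0 else "bad"

def ensemble_predict(applicant, learners=(by_income, by_debt, by_history)):
    """Collect one prediction per learner and combine them by majority vote."""
    return majority_vote([learner(applicant) for learner in learners])

applicant = {"income": 55000, "debt_ratio": 0.55, "late_payments": 0}
print(ensemble_predict(applicant))
```

Here one learner votes "bad" but is outvoted by the other two, which shows how an ensemble can absorb the error of a single weak rule.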

Decision tree-based user-centric security solution for critical IoT

infrastructure by Deepak Puthal, Mukesh Prasad, 2022.

The world today revolves around the internet; important messages are shared and discovered far and wide as a result. This paper states that proper processing and correct interpretation of the data provided by the internet may provide better insights to solve real-world problems. It states further that traditional IT infrastructures are inherently hybrid and diversified, although there is a shift: IT infrastructures are moving from a hardware-centric to a service-oriented infrastructure.

The authors of this article propose DecisionTSec, a user-centric adaptive security mechanism for IoT-based critical infrastructure. A decision tree mechanism was adopted and integrated with the cryptosystem to set the bar higher for attackers.

Piracema.io: A rules-based tree model for phishing prediction by

Carlo Marcelo Revoredo da Silva, Vinicius Cardoso Garcia, 2022.

The article observes that more than half of all credit card scams are made possible by phishing; malicious email scams are said to be another form of phishing.

The methodology of this paper is based on a rules tree processed by gradual analysis, with a structure organized by semantics and similarity that prioritizes the relevance of the features. The paper proposes a solution aimed at minimizing fraud incidents while the end user browses the Web; the targeted phishing is closed-scope fraud, such as spear phishing and smishing.

Slope Stability Classification under Seismic Conditions Using Several

Tree-Based Intelligent Techniques by Panagiotis G. Asteris, Fariz

Iskandar Mohd Rizal, Mohammadreza Koopialipoor, Panayiotis C.

Roussis, Maria Ferentinou, Danial Jahed Armaghani and Behrouz

Gordan, 2022.

Geotechnical engineers face many challenges in fully assessing a job site, which is why they usually apply analytical methods to check site locations before proceeding with their work.

In this paper, the authors state that slope stability analysis is a standard practice in geotechnical engineering employed for the estimation of the stability of natural or man-made slopes such as embankments of highways, railways, earth dams, tailings, etc. The analysis of slope stability mainly involves the calculation of the factor of safety (FOS), which is defined as the ratio between shear strength and the acting shear stress (Panagiotis et al., 2022:2).
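The definition of the factor of safety translates into a one-line calculation. In the sketch below the function names and input values are ours, and the stability threshold of 1.0 is an assumption corresponding to limit equilibrium; the reviewed paper and design codes may use other class boundaries.

```python
def factor_of_safety(shear_strength_kpa, shear_stress_kpa):
    """FOS = available shear strength / acting shear stress."""
    if shear_stress_kpa <= 0:
        raise ValueError("acting shear stress must be positive")
    return shear_strength_kpa / shear_stress_kpa

def is_stable(fos, threshold=1.0):
    """Assumed rule: a slope is labelled stable when FOS exceeds the threshold."""
    return fos > threshold

fos = factor_of_safety(60.0, 40.0)   # made-up values in kPa
print(fos, is_stable(fos))
```

A classifier such as those compared in the paper learns to predict the stable/unstable label directly from slope parameters, rather than computing the FOS explicitly for each case.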

The article observes that artificial intelligence (AI) and machine learning (ML) techniques have been successfully implemented in engineering and the sciences. The paper also states that a series of models were constructed to calculate FOS using standard geotechnical software.

The authors found that the proposed AdaBoost technique gave the best performance and the highest capability for classification. Therefore, it can be introduced as a new technique for slope stability classification with the largest number of matched cases.

The paper also establishes that proposing a new method for classifying slope stability cases using AI techniques requires extensive investigation. Therefore, in order to develop a model for classifying slope stability, a comprehensive database comprising real cases must be gathered and utilized (Panagiotis et al., 2022:15).

Ensemble Tree-Based Approach towards Flexural Strength

Prediction of FRP Reinforced Concrete Beams by Muhammad Nasir

Amin, Mudassir Iqbal, Kaffayatullah Khan, Muhammad Ghulam Qadir,

Faisal I. Shalabi and Arshad Jamal, 2022.

The paper observes that the rapid increase in the world's population creates a huge demand for infrastructure development; thus, the production of concrete has increased considerably. The work begins by noting that cement-based materials and concrete are used globally because of their low porosity and high mechanical strength.

This article presents an estimation of the bending capacity of concrete beams reinforced with fibre-reinforced polymer (FRP). The article cites Murad et al as having developed a gene expression programming (GEP) tree-based model for the flexural capacity of concrete beams reinforced with FRP rebars.

INTELLIGENT TREE-BASED ENSEMBLE APPROACHES FOR PHISHING WEBSITE

DETECTION by YAZAN A. ALSARIERA, ABDULLATEEF O. BALOGUN, VICTOR E.

ADEYEMO, OMAR H. TARAWNEH, HAMMED A. MOJEED, 2022.

It is observed that people now live online, as virtually everything is done on the internet; as a result, unsuspecting users have been defrauded of their valuables. The authors of this article state that, due to the absence of a standard internet protocol, the unregulated access and availability of these IT infrastructures create possibilities for internet threats and attacks.

They state that a phishing website involves the utilization of illegitimate websites and their resources to wrongfully acquire sensitive information from end-users. Biometric data, bank account details, and other sensitive information are taken from innocent users. In order to cope with the evolving complexities of phishing websites, machine learning (ML)-based approaches are used to analyse features retrieved from websites to evaluate their legitimacy (YAZAN et al., 2022:564).

This article proposed intelligent tree-based ensemble approaches for

phishing website detection. BFTree and NBTree were augmented with

ensemble methods for efficient phishing website detection models (YAZAN

et al.,2022:576).

| Title of Article | Methods | Tools | Techniques |
|---|---|---|---|
| forgeNet: a graph deep neural network model using tree-based ensemble classifiers for feature graph construction | Tree-based ensemble | Synthetic and real datasets | Forest graph-embedded deep feedforward network |
| INTELLIGENT TREE-BASED ENSEMBLE APPROACHES FOR PHISHING WEBSITE DETECTION | Naïve Bayes Tree (NBTree), Best First Tree (BFTree) | | Tree-based ensemble |
| Dynamic Fault Tree Analysis: State-of-the-Art in Modeling, Analysis and Tools | Petri Nets, Markov Models, Bayesian Networks, Simulation Approaches and Modularization Approaches for quantifying DFTs | Application of Machine Learning | Classical fault trees |
| Slope Stability Classification under Seismic Conditions Using Several Tree-Based Intelligent Techniques | A Tree-Based Intelligent Technique | Bishop simplified | Decision Tree (DT) |
| A Tree-based Machine Learning Model for Go-around Detection and Prediction | A Tree-based Machine Learning Model | ADS-B data | Data-driven model |
| Tree-based nonlinear ensemble technique to predict energy dissipation in stepped spillways | Tree-based non-linear ensemble | Support vector machine (SVM), K-star (K) algorithm and artificial neural networks (ANN) | Random Forest (RF-E) |
| Long-Term Wind Power Forecasting Using Tree-Based Learning Algorithms | Tree-based learning algorithms | MW Wind Turbines | Decision Tree, Bagging, Random Forest, Boosting, Gradient Boosting, XGBoost |
| Ranking top-k trees in tree-based phylogenetic networks | Tree-based phylogenetic networks | | Tree-based phylogenetic |
| Ensemble Tree-Based Approach towards Flexural Strength Prediction of FRP Reinforced Concrete Beams | Ensemble tree-based approach | 60% of the database were validated first-hand on the remaining 40% | Artificial intelligence (AI), decision tree (DT) and gradient boosting tree (GBT) approaches |
| New M5P model tree-based control for doubly fed induction generator in wind energy conversion system | M5P model tree-based control | Matlab/Simulink software. Simplified | M5 model tree |
| Novel Ensemble Tree Solution for Rockburst Prediction Using Deep Forest | Ensemble Tree Solution | Bayesian optimization | Deep forest |
| A gradient boosted decision tree-based sentiment classification of twitter data | Tree-based sentiment classification | HDFS MapReduce | Improved Elephant Herd Optimization (I-EHO) technique |
| Robust Counterfactual Explanations for Tree-Based Ensembles | Tree-based Ensemble | Metric Counterfactual Stability | RobX |
| Performance Evaluation of Deep Learning-Based Gated Recurrent Units (GRUs) and Tree-Based Models for Estimating ETo by Using Limited Meteorological Variables | Limited meteorological variables | 15 input scenarios, consisting of meteorological variables including maximum and minimum temperature, wind speed, maximum and minimum relative humidity, dew point temperature, and sunshine duration simplified | Random forest model, M5 tree model, random tree, regression tree |
| FIST: A Feature-Importance Sampling and Tree-Based Method for Automatic Design Flow Parameter Tuning | Tree-based method | ITC99, 45nm NanGate Library | XGBoost model, dynamic tree technique |
| RDTIDS: Rules and Decision Tree-Based Intrusion Detection System for Internet-of-Things Networks | Tree-Based Intrusion Detection System | CICIDS2017 dataset and BoT-IoT dataset | REP Tree, JRip algorithm and Forest PA |
| TreeCaps: Tree-Based Capsule Networks for Source Code Processing | Tree-Based Capsule Networks | Java and C/C++ programs | TreeCaps |
| An Evaluation of Preprocessing Steps and Tree-based Ensemble Machine Learning for Analysing Sentiment on Indonesian YouTube Comments | Tree-based Ensemble | Converting emoticons and handling unstructured words | Naïve Bayes (NB), Support vector machine (SVM), Decision Tree, Random Forest, and Extra Tree classifier |
| A Continuous Cuffless Blood Pressure Estimation Using Tree-Based Pipeline Optimization Tool | Tree-Based Pipeline | PhysioNet global dataset | Random forest (RF) and K-nearest neighbours (KNN) |
| Artificial Flora Algorithm-Based Feature Selection with Gradient Boosted Tree Model for Diabetes Classification | Gradient Boosted tree Model | Three diabetes datasets | Artificial flora algorithm (AFA)-based feature selection, and gradient boosted tree (GBT)-based classification |
