Authorized licensed use limited to: Badan Riset Dan Inovasi Nasional. Downloaded on July 12,2022 at 03:46:20 UTC from IEEE Xplore. Restrictions apply.
attribute is wide. Meanwhile, its regression tree splits the attributes by minimizing the squared prediction error. The advantages of this algorithm include its flexibility, variable selection, and handling of interactions among variables. Its disadvantages include splitting on only one variable at a time and the instability of the tree.
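The squared-error split criterion mentioned above can be illustrated with a minimal sketch. It is not the CART implementation used in this study: the helper names `sse` and `best_split` and the toy data are hypothetical, and only a single numeric attribute is searched.

```python
def sse(values):
    """Sum of squared errors of `values` around their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the split `x <= threshold`
    that minimizes the summed squared error of the two child nodes."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_error = None, float("inf")
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical attribute values cannot be separated
        threshold = (lo + hi) / 2  # midpoint between neighbouring values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sse(left) + sse(right)
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold, best_error


# Toy data (illustrative only): two clearly separated groups.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]
threshold, error = best_split(xs, ys)
print(threshold, error)
```

A full regression tree would apply this search to every attribute at every node and recurse on the two resulting subsets. Because each split uses a single attribute, this also illustrates the one-variable limitation noted above.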
Fig. 2 Sample of a DT visual representation

A tree model that predicts a class as its output is a classification tree, while a tree model that predicts a real number is a regression tree [7, 8]. Likewise, a DT is similar to a long chain of if-else statements [9]. The process can be visualized through the sample tree diagram in Figure 3.

III. METHODOLOGY
A. Data Acquisition and Pre-processing
The method of data acquisition performed in this study was adapted from [5]. The block diagram in Figure 4 outlines the procedure for the noncontact pH level indicator.
the pH level is stored through a knowledge-based system. In this study, pH values were acquired using a universal pH scale tester similar to Figure 4.

Fig. 4 Sample of a universal pH scale tester

The data gathered from the noncontact pH level indicator provided the HSV values as input together with their corresponding pH value as output. The pH level ranges from 0 to 14 in increments of 0.1 [10, 11]. Table I provides a sample of the training dataset for the DT, while Table II provides a sample of the test dataset.

TABLE I. TRAINING DATASET
H      S      V      pH
179.5  222.1  238.9  0.0
179.2  222.8  238.3  0.0
179.7  223.3  238.8  0.0
179.0  224.6  238.2  0.0
179.1  222.6  239.4  0.0
179.3  223.1  238.7  0.0
179.8  223.1  238.6  0.0
0.4    223.6  239.4  0.1
0.4    224.2  239.0  0.1
0.2    223.3  238.9  0.1
0.3    224.5  238.7  0.1
0.5    221.4  238.0  0.1
0.3    221.9  239.5  0.1
0.1    222.1  239.2  0.1
1.3    223.1  239.7  0.2
1.6    221.7  239.1  0.2
0.8    223.5  238.7  0.2
1.7    224.5  238.3  0.2

B. Classification Method
As the aim of the study was to determine the effectiveness of a pH level indicator using HSV, a DT classifier, along with a graphical representation of the final DT, was implemented in Python. The scikit-learn library was used to generate the DT through the CART algorithm, while the tree was visualized using the graphviz package. In the data acquisition stage, 10 different HSV readings were collected for every pH value. After acquisition, the samples were split into training and test sets, with 7 training values and 3 test values per pH level. Overall, the data set has a total of 1,410 samples (141 pH levels with 10 readings each).

IV. RESULTS AND DISCUSSION
In this study, a total of 1,410 samples were used, with 10 HSV readings per pH level. The feature data was divided into two sets: training and testing. The recorded data was used to compute the system’s accuracy. Moreover, the DT model was exported to graphviz, a segment of which is shown in Figure 5.
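The classification workflow described above (a 7/3 split per pH level, a CART tree from scikit-learn, and a graphviz export) might be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The HSV readings are synthetic placeholders rather than the measured data. The pH labels are encoded as integer tenths (0 to 140), because scikit-learn classifiers do not accept fractional floats as class labels. The use of the first 7 readings of each level for training is an arbitrary choice.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_graphviz

rng = np.random.default_rng(0)

# pH 0.0 to 14.0 in steps of 0.1, encoded as integer tenths (assumption).
ph_tenths = np.arange(141)
READINGS_PER_LEVEL = 10
n_samples = ph_tenths.size * READINGS_PER_LEVEL  # 1,410

# Synthetic placeholder HSV readings (NOT the paper's data): each pH level
# gets an arbitrary hue centre plus small noise.
hue_centres = np.linspace(0.0, 179.0, ph_tenths.size)
X = np.column_stack([
    np.repeat(hue_centres, READINGS_PER_LEVEL) + rng.normal(0, 0.2, n_samples),
    222.0 + rng.normal(0, 1.0, n_samples),
    238.0 + rng.normal(0, 0.5, n_samples),
])
y = np.repeat(ph_tenths, READINGS_PER_LEVEL)

# Per-level split: first 7 readings for training, remaining 3 for testing.
index_in_level = np.tile(np.arange(READINGS_PER_LEVEL), ph_tenths.size)
train_mask = index_in_level < 7
X_train, y_train = X[train_mask], y[train_mask]
X_test, y_test = X[~train_mask], y[~train_mask]

# scikit-learn's DecisionTreeClassifier is based on an optimized CART variant.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Export the fitted tree as DOT source; with out_file=None a string is returned.
dot_source = export_graphviz(
    clf,
    out_file=None,
    feature_names=["H", "S", "V"],
    class_names=[f"{t / 10:.1f}" for t in clf.classes_],
    filled=True,
)
print(X_train.shape, X_test.shape, round(accuracy, 3))
```

The DOT string can then be rendered with the graphviz package, for example with `graphviz.Source(dot_source)`, provided the Graphviz binaries are installed.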
higher false positives. Lastly, the F-measure shows the balance between precision and recall, as it accounts for both false positives and false negatives.

V. CONCLUSION AND RECOMMENDATIONS

[10] J. C. Puno, E. Sybingco, E. Dadios, I. Valenzuela, and J. Cuello, “Determination of soil nutrients and pH level using image processing and artificial neural network,” IEEE 9th Int. Conf. Humanoid, Nanotechnology, Inf. Technol. Commun. Control. Environ. Manag., pp. 1–6, 2017.
[11] B. Building, L. Terrace, U. Kingdom, and I. Systems, “Decision tree learning based feature evaluation and.”