Professional Documents
Culture Documents
2307-Article Text-6924-1-2-20220517
2307-Article Text-6924-1-2-20220517
Jurnal Mantik
Journal homepage: www.iocscience.org/ejournal/index.php/mantik/index
CV. Mitra Karya Sejati was founded in 1998 by Tardi Sutanto as a trading
company in the field of agricultural materials, such as multipurpose fertilizers,
Article history: pesticides and seeds. CV. Mitra Karya Sejati is often in short supply due to the
Received: ... increasing demand for consumer goods and even some items are not sold, so it is
Revised: .... difficult to predict the stock of goods in the warehouse so far based on sales data
Accepted:.... which is used only as data to report monthly sales results which will then not be
used again. . Companies need a sales forecasting system to help you plan stock
items in the future. The application of C 5.0 to data mining can make it easier for
companies to estimate stock of goods based on sales of agricultural materials.
The Data Mining application is built using the PHP programming language and
Keywords: MySQL database to reduce inventory planning losses. Through the proposed
C 5.0, Data Mining, Prediction system can help companies estimate the stock of goods to be provided to
minimize losses.
Copyright © 2021 Jurnal Mantik.
All rights reserved.
1. Introduction
The rapid development of technology at this time makes it easier for people to get various kinds of
information. Almost all Indonesian people, whether for children, teenagers or adults, are involved with
technology every day. The role of technology for now is considered capable of replacing human activities.
With the rapid development of today's technology, the presentation of information effectively and efficiently
is increasingly needed. Where the presentation of information should have shifted from the old manual
recording system to a computerized system [1].
CV. Mitra Karya Sejati Jl. Menggala Sakti Km.24, Kec. Tanah Putih, Kab. Rokan Hilir, Riau is a trading
company that offers agricultural tools and materials at low prices. CV. Mitra Karya Sejati uses a sales
information system to support the company's performance in sales information services. The availability of
agricultural products at low prices is a challenge in itself to maintain a stable stock of goods in the
warehouse. Although it looks easy, managing inventory is not something that can be taken lightly, because if
there is too much inventory, the risk of damage to goods will increase, and vice versa, if there is a small
amount of inventory, the risk of shortage of goods will be higher than inventory and can delay profits and
make consumers disappointed.
Sales of agricultural materials have increased in recent months, so the availability of goods in
warehouses is often short due to the increasing demand for consumer goods and the fact that some items are
not even sold. As happened in June to September in 2021 where there was a shortage of goods in Meroke
fertilizer, and damage to CU Daun fertilizer resulting in losses, this was due to the large number of orders for
certain fertilizers at the company from June to September, so there is difficulty in predicting the stock of
goods in the warehouse. This announcement is very useful in determining how many products to ship next
month. The problem that is often faced by companies is the level of accuracy in predicting future commodity
stocks based on previous sales data. These predictions have a big influence in determining which products
472
Accredited “Rank 4”(Sinta 4), DIKTI, No. 36/E/KPT/2019, December 13 2019. th
Jurnal Mantik is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Jurnal Mantik, Vol. 4 , No. 1, May 2020, pp. 472-482 E-ISSN 2685-4236
sell and which don't. This research is based on CV sales data. Mitra Karya Sejati uses monthly sales data only
as data to report monthly sales results, then it will not be used again.
Data mining can be used to extract information from big data to obtain information that can be used to
predict, classify, and estimate. Algorithm method C 5.0 Data mining is one of the classification algorithms
that can specifically be applied to Decision Tree technicians. The C 5.0 algorithm is an improvement from
the ID3 and C 4.5 algorithms [2].
Previous research by Dwi Marisa Efendi and Pakarti Riswanto in a scientific paper entitled "Application
of Data mining for Fertilizer Sales Prediction Using Apriori Algorithm Method" published in JIK (Journal of
Information and Computers) Vol. 9 no. 1, 2021. This study performs calculations using the a priori algorithm
method to determine the amount of fertilizer sold at the Fadilah Semuli Jaya shop [3]. Research by Yogi
Yusuf W., conducted a comparative study of the performance of the Decision Tree C 5.0, CART and CHAID
algorithms with a case study to predict credit risk status at Bank X, the results showed the average accuracy
rate for the C 5.0 algorithm method was 87.72%, CART is 87.27% and CHAID is 87.15% [4]. Research with
the title "Application of Data mining for Prediction of Sales of Medical Devices Using the C 4.5 Algorithm
PT. Murni Indah Sentosa” examines classification techniques applied to sales data analysis, particularly
topics related to medical device sales transactions. Determining the stock of medical devices helps managers
determine sales forecasts for the next period [5].
Based on these phenomena and problems, this study aims to facilitate the supply of goods in the CV.
Mitra Karya Sejati in planning inventory and making it easier for companies to find fertilizer products that
need to be supplied and are in great demand by consumers using the C 5.0 algorithm.
2. Method
2.1 Waterfall
This study uses the waterfall method as a method of developing the system that was built. The stages of
the waterfall method are depicted in Figure 1 below.
Jurnal Mantik is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Judul Artike –Nama Penulis
subsequent phases. Each unit is developed and tested for features called unit tests.
4. Verification
In this phase, the system is inspected and tested to see if the system fully or partially meets the system
requirements, the tests can be classified as unit testing (performed on a particular code module), system
testing (to see how the system reacts when all modules are integrated) and acceptance tests ( with or on
behalf of the customer to see if all customer requirements are met).
5. Maintenance
This is the final step in the Waterfall method. Completed software run and maintained. Maintenance fixes
bugs not found in the previous step.
The first step is data processing, namely the data cleaning process which will then be calculated for
Entropy and Gain. Entropy calculation formula is written in equation (1).
m
Entropy (S) = ∑ pi log 2( pi) (1)
i=1
Information:
S : set
A : attribute
n : number of partitions S
474
Accredited “Rank 4”(Sinta 4), DIKTI, No. 36/E/KPT/2019, December 13 2019. th
Jurnal Mantik is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Jurnal Mantik, Vol. 4 , No. 1, May 2020, pp. 472-482 E-ISSN 2685-4236
pi : the proportion of Si to S
The second stage is to calculate the Gain which is written in equation (2) below.
Y
Gain (S,A) = Entropy (S) ∑ Dj
D
∗¿ Entropy(Si)¿ (2)
J =1
Furthermore, the calculation of split information is used to select a number of attributes, with the
formula in equation (3) below.
c
Si Si
SplitInfo ( S , A )=∑ ¿ 1 log 2 (3)
i S S
Where S is the number of sample data and Si is the number of each on each attribute.
After getting the results of the split information, the next step is to calculate the Gain Ratio which is
written in the following equation (4).
Gain (S , A)
Gain Ratio= (4)
Split Info( S , A)
The following is a Decision Tree development method by utilizing the C 5.0 algorithm:
1. The tree is first given as a single node, which represents the training set.
2. If the same class appears in all samples, the node becomes a leaf and is labeled with that class.
3. Otherwise, the algorithm will select predictor variables based on entropy-based metrics (Information
Gain) to divide the data into different classes. At that node, this variable becomes a test or decision
variable.
4. For each known value of the test variable, a branch is created, and the sample is partitioned according to
that branch.
5. To create a Decision Tree, the algorithm is repeated according to the procedure.
6. Only one of the following conditions must be met for a recursive partition to terminate:
1) The class of all records on a node is the same.
2) The remaining record characteristics cannot be partitioned again. The majority vote applies in this
situation. The node becomes a leaf node, with most of the current record classes indicated [8].
From the table above, it can be concluded that the highest Gain Ratio lies in the total attribute with a
475
Accredited “Rank 4”(Sinta 4), DIKTI, No. 36/E/KPT/2019, December 13 2019. th
Jurnal Mantik is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Judul Artike –Nama Penulis
value of 0.10750226 so that it can be taken as a benchmark in the decision tree node.
In Figure 3 it can be seen that the highest gain ratio is the total, if the total sales are >138 then the sales
are in demand, and if the total is <= 138, the decision is not yet made because the nodes or roots must be
calculated again.
TP
Precision= (7)
TP+ FP
TP
Recall= (8)
TP+ FN
Where :
TP (True Positive) = Amount of Actual Data = 'Selling' and Predicted = 'Selling'
FN (False Negative) = Total Data Actual = 'Selling' and Predicted = 'Not Selling'
476
Accredited “Rank 4”(Sinta 4), DIKTI, No. 36/E/KPT/2019, December 13 2019. th
Jurnal Mantik is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Jurnal Mantik, Vol. 4 , No. 1, May 2020, pp. 472-482 E-ISSN 2685-4236
TN (True Negative) = Total Data Actual = 'Not Selling' and Predicted = 'Selling'
FP (False Positive) = Total Data Actual='Not Selling' and Predicted='Not Selling'
TABLE 2
CONFUSION MATRIX
Identified Selling Identified Not Selling
From table 2 above, the accuracy level of the classification can be measured as follows:
TP = 193, FN = 0, TN = 79, FN = 0
Accuracy = ((TP + TN) / (TP+FP+TN+FN)) x 100%
Accuracy = ((193+79) / (193+0+79+0)) x 100%
Accuracy = 100%
477
Accredited “Rank 4”(Sinta 4), DIKTI, No. 36/E/KPT/2019, December 13 2019. th
Jurnal Mantik is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Judul Artike –Nama Penulis
Jurnal Mantik is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Jurnal Mantik, Vol. 4 , No. 1, May 2020, pp. 472-482 E-ISSN 2685-4236
References
[1] M. Danuri, “Development and Transformation of Digital Technology,” Infokam, vol. XV, no. II, pp.
116–123, 2019.
[2] A. W. Wijayanti, “Analisis Hasil Implementasi Data Mining Menggunakan Algoritma Apriori pada
Apotek,” J. Edukasi dan Penelit. Inform., vol. 3, no. 1, pp. 60–64, 2017, doi:
10.26418/jp.v3i1.19534.
[3] D. M. Efendi and P. Riswanto, “Penerapan Data Mining Untuk Prediksi Penjualan Pupuk Dengan
Metode Algoritma Apriori,” J. Inf. dan Komput., vol. 9, no. 1, pp. 16–21, 2021, doi:
10.35959/jik.v9i1.196.
[4] Y. Yusuf W, “Perbandingan Performansi Algoritma Decision Tree C5 . 0, Cart, dan Chaid: Kasus
Prediksi Status Resiko Kredit di Bank X,” in Seminar Nasional Aplikasi Teknologi Informasi 2007
(SNATI 2007), 2007, vol. 2007, no. Snati.
[5] A. Fikri and W. Verina, “Penerapan Data Mining Untuk Prediksi Penjualan Alat Medis
Menggunakan Algoritma C4.5 Pt. Murni Indah Sentosa,” Infosys (Information Syst. J., vol. 5, no. 1,
pp. 70–82, 2020, doi: 10.22303/infosys.5.1.2020.70-83.
[6] A. A. Wahid, “Analisis Metode Waterfall Untuk Pengembangan Sistem Informasi,” J. Ilmu-ilmu
Inform. dan Manaj. STMIK, pp. 1–5, 2020.
[7] N. Putri, Rizkita Yolanda; Mukhlash, Imam; Hidayat, “Prediksi Pola Kecelakaan Kerja Pada
Perusahaan Non Ekstraktif Menggunakan Algoritma Decision Tree: C4.5 dan C5.0,” Sains Dan Seni
Pomits, vol. 2, no. 1, pp. 1–6, 2017.
[8] Carlis Hutabarat, “Penerapan Data Mining Untuk Memprediksi Permintaan Produk Kartu Perdana
Internet Menggunakan Algoritma C5.0 (Studi Kasus: Vidha Ponsel),” Pelita Inform., vol. 6, no. 4,
pp. 419–424, 2018.
[9] R. N. Amalda, N. Millah, and I. Fitria, “Implementasi Algoritma C5.0 dalam Menganalisa Kelayakan
penerima Keringanan UKT Mahasiswa ITK,” Teorema Teor. dan Ris. Mat., vol. 7, no. 1, pp. 101–
116, 2022.
479
Accredited “Rank 4”(Sinta 4), DIKTI, No. 36/E/KPT/2019, December 13 2019. th
Jurnal Mantik is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).