KNN For Nilm Fitra Acm
APCORISE'20, June, 2020, Depok, West Java, Indonesia. F. Hidiyanto et al.

KEYWORDS
Non-Intrusive Load Monitoring (NILM), k-nearest neighbour (KNN), disaggregation, real power and reactive power, accuracy, precision, recall.

1 Introduction

Energy management is one of the topics that is developing today; it is one solution to the problem of energy shortages due to an increase in the number of industries and population.

The aggregate power at time t can be written as P(t) = Σ_i P_i(t), where P(t) is the total (aggregated) power consumption reading and P_i is the power of each device that is on [4]. Fig. 1 below shows the total power consumption chart of 3 appliances, which shows similar real power overlapping under several conditions. In addition to solving the problem of similar load characteristics in NILM, this paper also seeks answers on which training and test data, which distance method, and which k values will provide maximum accuracy.

Table 1. Classification of NILM disaggregation
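The additive model P(t) = Σ_i P_i(t) and the overlapping-load problem it creates can be illustrated with a small sketch. The appliance names and power values below are hypothetical, chosen only to show how two different appliances can present nearly identical real power at the meter:

```python
# Sketch of the NILM additive model P(t) = sum_i P_i(t):
# the meter sees only the aggregate, so appliances with similar
# real power are hard to tell apart from real power (P) alone.

appliances = {
    # name: (real power in W, reactive power in var) - hypothetical values
    "kettle": (1500.0, 50.0),
    "heater": (1500.0, 5.0),   # same real power as the kettle, different Q
    "fridge": (120.0, 90.0),
}

def aggregate(on_states):
    """Total real power seen at the meter for a set of ON appliances."""
    return sum(appliances[name][0] for name in on_states)

print(aggregate({"kettle", "fridge"}))  # 1620.0
print(aggregate({"heater", "fridge"}))  # 1620.0 - identical aggregate
```

Both appliance combinations produce the same aggregate real power, which is why this paper adds reactive power as a second, differentiating feature.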
2.2 KNN Methods

K-nearest-neighbour (KNN) is one of the machine learning algorithms used to classify new objects based on a number of their closest neighbours. The algorithm is relatively simple and is classified as supervised learning, a lazy learning algorithm, and instance-based or memory-based learning [6]. The KNN classification algorithm is as follows:

Prepare training data and test data from the NILM data
Choose a k value
Compute the distance value (d) from each test sample to every training sample
Sort d from smallest to biggest
Classify each test sample based on its k closest neighbours

In this paper we use the 9 distances that perform best in terms of accuracy among the 8 major distance families, which contain 54 kinds of distances in total [7]. The 9 distances used in this paper include:

a) Manhattan distance (MD): d(x, y) = Σ_i |x_i − y_i|

b) Lorentzian distance (LD): d(x, y) = Σ_i ln(1 + |x_i − y_i|)

c) Canberra distance (CanD): d(x, y) = Σ_i |x_i − y_i| / (|x_i| + |y_i|)

d) Clark distance (ClaD): d(x, y) = √( Σ_i ((x_i − y_i) / (|x_i| + |y_i|))² )

i) Euclidean distance (ED), the commonly used method: d(x, y) = √( Σ_i (x_i − y_i)² )

where x is a test sample, y is a training sample, and the sums run over the feature components i.

Figure 5. TV and Entertainment Usage Chart
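The five KNN steps and some of the distance functions listed above can be sketched in pure Python. The two-feature samples (real power, reactive power) and appliance labels below are made up for illustration, not taken from the paper's dataset:

```python
import math
from collections import Counter

# Distance functions from the list above; x and y are feature
# vectors, here [real power P, reactive power Q].
def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def lorentzian(x, y):
    return sum(math.log(1 + abs(a - b)) for a, b in zip(x, y))

def canberra(x, y):
    # Skip components where both values are zero to avoid 0/0.
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(x, y) if abs(a) + abs(b) > 0)

def knn_classify(test_point, train_x, train_y, k, dist=euclidean):
    """Steps 3-5: distance to every training sample, sort, majority vote."""
    order = sorted(range(len(train_x)), key=lambda i: dist(test_point, train_x[i]))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Step 1: training data [P, Q] with appliance labels (hypothetical).
train_x = [[1500, 50], [1500, 5], [120, 90], [1495, 48], [118, 85]]
train_y = ["kettle", "heater", "fridge", "kettle", "fridge"]

# Step 2: choose k; then classify a test reading.
print(knn_classify([1498, 45], train_x, train_y, k=3))                  # kettle
print(knn_classify([1498, 45], train_x, train_y, k=3, dist=manhattan))  # kettle
```

With reactive power included as a second feature, the [1498, 45] reading is separated from the heater's [1500, 5] even though the real powers are nearly identical.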
Table 2. Accuracy for Real power only, 100% test data and 10% - 70% training data; the majority k value is 3.

Table 5. Accuracy for Real & Reactive power, 100% test data and 10% - 70% training data.
It can be seen that the accuracy value increases as the training set grows, as shown in the graph in fig. 8 below.

For the cross-validation test with Real & Reactive Power data, the accuracy results for 10.33% - 81.09% training data can be seen in fig. 9, which shows that 21% training data achieved the highest accuracy value, 72.69%, at k = 14; it is thus very beneficial that small training data can produce high accuracy.
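The accuracy, precision, and recall figures compared in these tables can be computed with a sketch like the following (macro-averaged over appliance classes; the labels below are hypothetical, not the paper's results):

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    """Fraction of test samples whose predicted appliance matches the truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_precision_recall(y_true, y_pred):
    """Per-class precision and recall, averaged over all classes."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but it was t
            fn[t] += 1  # missed an actual t
    classes = set(y_true) | set(y_pred)
    prec = sum(tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
               for c in classes) / len(classes)
    rec = sum(tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
              for c in classes) / len(classes)
    return prec, rec

y_true = ["kettle", "kettle", "fridge", "heater"]
y_pred = ["kettle", "heater", "fridge", "heater"]
print(accuracy(y_true, y_pred))                # 0.75
print(macro_precision_recall(y_true, y_pred))  # macro precision and recall
```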
From Table 7 Precision and Recall above, it can be seen that precision and recall reach their maximum when the accuracy is maximum, namely for the Euclidean distance with 21% training data, with a precision of 0.7659 and a recall of 0.7606; this can be compared with the results of the real-power-only data test, which reaches only 0.5593 precision and 0.5637 recall.

It can be concluded that disaggregation on the combined Real Power and Reactive Power feature data is more than 20% higher in accuracy, precision and recall than disaggregation on the real power feature alone, both in the 100% data test and in the cross-validation test.

4 Conclusion

The KNN method for disaggregating NILM data with similar load characteristics gives much better results when other data are added as a differentiator; in this research this is reactive power data, which gives different values for some equipment even when the real power values are the same, resulting in more than 20% higher accuracy than the disaggregation method with real power values only. Of the 9 methods tested, the Average, Canberra, Clark, Divergence, Euclidean, Hassanat, Lorentzian, Manhattan and Squared Chi-Squared methods all give good accuracy, varying with the amount of training data.

From the test results on real-power-only data with training data varied from 10% to 70% and tested on the initial 100% data, the highest Accuracy / RR value obtained was 72.61% at 70% training data with the Manhattan, Average and Euclidean distances for k = 3, whereas when tested with cross-validation data, where the test data are not the same as the training data, the highest Accuracy / RR value is 54.93% with 81% training data for the Average, Euclidean, Hassanat, Lorentzian and Manhattan distances with k = 20.

An interesting point in the disaggregation test results for all distances on the 100% initial data is that the accuracy value increases as training data are added, with k = 1. For the KNN disaggregation test results on cross-validation data, where the test data differ from the training data, the maximum accuracy of KNN with real power & reactive power features is obtained at 21% training data, with 72.69% accuracy for k = 14, while for KNN with the real power feature only, the maximum accuracy is obtained at 81% training data with 54.93% accuracy for k = 20, which is only a small difference from 10.33% training data with 54.77% accuracy at k = 24. In terms of processing time, it is therefore more efficient to use small training data that produces maximum accuracy, or only slightly lower accuracy, than KNN on large training data.
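The cross-validation setup referred to above, in which the test data are disjoint from the training data and the training fraction is swept, can be sketched as follows (the helper name and data are our own, for illustration only):

```python
import random

def split(samples, train_fraction, seed=0):
    """Disjoint train/test split: test data are never the same as training data."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    cut = int(len(samples) * train_fraction)
    train = [samples[i] for i in idx[:cut]]
    test = [samples[i] for i in idx[cut:]]
    return train, test

data = list(range(100))  # stand-in for labelled (P, Q) readings
for frac in (0.1, 0.2, 0.5, 0.7):
    train, test = split(data, frac)
    assert not set(train) & set(test)  # disjoint by construction
    print(frac, len(train), len(test))
```

At each fraction, the KNN classifier would be fitted on `train` and scored on `test`, so the reported accuracies measure generalisation to unseen readings rather than memorisation.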
From the test results with training data varied from 10% to 70%, the recognition rate improves with more training data when testing on the initial 100% data, obtaining the highest accuracy value of 95.06%, precision 0.957 and recall 0.9568 at 70% training data with the Manhattan and Lorentzian distances for k = 1, whereas when tested with cross-validation data, where the test data are not the same as the training data, the highest accuracy value obtained is 72.69% with 21% training data for the Euclidean distance with k = 14.

The results of disaggregation with the KNN algorithm with additional reactive power are very good, up to 95% accuracy, 0.957 precision and 0.9568 recall for disaggregation tests on the initial 100% data with 70% training data, while disaggregation with the KNN algorithm without additional reactive power achieves around 72.61% accuracy, 0.7654 precision and 0.7382 recall on the initial 100% data with 70% training data.

The results of disaggregation by the KNN method with additional reactive power values under the cross-validation method, where the test data are not the same as the training data, reach up to 72.69% accuracy, 0.7659 precision and 0.7606 recall with 21% training data, while disaggregation by the KNN method with real power data only under cross-validation reaches up to 54.77% accuracy, 0.5593 precision and 0.5637 recall with 10.3% training data.

REFERENCES

[1] Sigit T. A., Abdul H. (2018). "Steady State Modification Method Based On Backpropagation Neural Network For Non-Intrusive Load Monitoring (NILM)". MATEC Web of Conferences 218, 02013 (2018). ICIEE 2018. https://doi.org/10.1051/matecconf/201821802013
[2] J. Z. Kolter, M. J. Johnson (2011). "REDD: A Public Data Set for Energy Disaggregation Research". In Proceedings of the SustKDD Workshop on Data Mining Applications in Sustainability. http://redd.csail.mit.edu/
[3] Antonio R., Alvaro H., Jesus U., Maria R. & Juan G. (2019). "NILM Techniques for Intelligent Home Energy Management and Ambient Assisted Living: A Review". Energies 2019, 12, 2203. doi: 10.3390/en12112203. www.mdpi.com/journal/energies
[4] G. W. Hart (1992). "Nonintrusive appliance load monitoring". Proceedings of the IEEE, vol. 80, no. 12, pp. 1870-1891.
[5] M. Zhuang, M. Shahidehpour, and Z. Li (2018). "An Overview of Non-Intrusive Load Monitoring: Approaches, Business Applications, and Challenges". 2018 International Conference on Power System Technology (POWERCON). Paper no. 201804270000624.
[6] Rifkie P. (2018). "Belajar Machine Learning, Teori dan Praktik". Penerbit Informatika.
[7] V. B. S. Prasatha, H. A. A. Alfeilate, A. B. A. Hassanate, O. Lasassmehe, A. S. Tarawnehf, M. B. Alhasanat, H. S. E. Salmane (2019). "Effects of Distance Measure Choice on KNN Classifier Performance - A Review". Big Data, vol. 7, no. 4, December 16, 2019, pp. 221-248. http://doi.org/10.1089/big.2018.0175
[8] Makonin, Stephen (2016). "AMPds2: The Almanac of Minutely Power dataset (Version 2)". https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/FIE0S4
[9] C. C. Yang, C. S. Soh, V. V. Yap (2017). "A systematic approach in appliance disaggregation using k-nearest neighbours and naive Bayes classifiers for energy efficiency". Springer Science+Business Media B.V. 2017.
[10] I. Abubakar, S. N. Khalid, M. W. Mustafa, H. Shareef and M. Mustapha (2016). "Recent Approaches and Applications of Non-Intrusive Load Monitoring". ARPN Journal of Engineering and Applied Sciences, vol. 11, no. 7, April 2016. ISSN 1819-6608. http://www.arpnjournals.com