Article

Non-Invasive Classification of Blood Glucose Level for Early Detection Diabetes Based on Photoplethysmography Signal

Ernia Susana *, Kalamullah Ramli 1, Hendri Murfi 2 and Nursama Heru Apriantoro 3

1 Department of Electronic Engineering, Universitas Indonesia, Depok 16424, Indonesia; kalamullah.ramli@ui.ac.id
2 Department of Mathematics, Universitas Indonesia, Depok 16424, Indonesia; hendri@sci.ui.ac.id
3 Department of Radiodiagnostic and Radiotherapy, Politeknik Kesehatan Kemenkes Jakarta II, Jakarta 12120, Indonesia; nursama_91@yahoo.co.id
* Correspondence: ernia.susana01@ui.ac.id; Tel.: +62-878-7054-6698

Abstract: Blood glucose level (BGL) is an important parameter for the early detection of diabetes. A method based on a single photoplethysmography (PPG) signal for the classification of BGL can automatically analyze diabetes symptoms. Users can immediately know the condition of their BGL to ensure early detection. In recent years, deep learning methods have shown outstanding performance in classification applications. However, deep learning classification methods have two main problems: classification accuracy and time consumption during training. We attempt to address these limitations and propose a method for the classification of BGL using the K-nearest neighbors (KNN) algorithm based on PPG. We collected 400 data records and divided the subjects into two classification levels, namely normal and diabetes, according to the BGL levels of the World Health Organization (WHO). The results indicate that the proposed method can achieve improved classification accuracy without additional manual pre-processing of PPG. Our proposed method achieves higher accuracy than convolutional neural networks (deep learning), bagged tree, logistic regression, and AdaBoost tree.

Keywords: diabetes; blood glucose levels; photoplethysmography; machine learning; k-nearest neighbors

Citation: Lastname, F.; Lastname, F.; Lastname, F. Title. Information 2021, 12, x. https://doi.org/10.3390/xxxxx

Academic Editor: Firstname Lastname

Received: date
Accepted: date
Published: date

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Submitted for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Diabetes is one of the four priority non-communicable diseases (NCDs), after heart disease and stroke, cancer, and chronic lung disease, which are the leading causes of death in adults globally. According to the International Diabetes Federation (IDF) 2017 [1], the prevalence of diabetes will continue to increase globally. Diabetes is a silent killer that can cause complications such as heart disease, kidney disease, blindness, and disability [2]. Diabetes is divided into two types, namely type 1 and type 2. Type 1 diabetes is highly insulin dependent and is caused by the patient's body not being able to produce the hormone insulin. Type 2 diabetes (T2D) is caused by the body's cells becoming less sensitive to the hormone insulin, even though insulin production and levels are normal [3].

Type 2 diabetes is more common than type 1 and accounts for about 95% of diabetes cases worldwide [4]. Various factors can increase the risk of type 2 diabetes, including gender, lifestyle, hypertension, obesity and inactivity, and increasing age [5-9]. Management of T2D is an effort to prevent or delay complications and to preserve quality of life. It consists of early prevention of risk factors through promotive and preventive efforts, without neglecting curative and rehabilitative efforts. With optimal management, diabetes is controllable and people with diabetes can live long and healthy lives. Type 2 diabetes can be avoided or reduced through a better understanding of the risks and through lifestyle changes [6, 7]. Effective prevention in diabetes management requires regular monitoring of blood glucose levels (BGL) [10].
A monitoring system for the early detection of diabetes should be implemented as early as possible to avoid very expensive medical costs. Good habits such as a healthy lifestyle, exercise, not smoking, and stress management need to be introduced from childhood and adolescence as part of diabetes prevention efforts. One effective strategy to reduce the risk of diabetes and its complications is early diagnosis through independent control of BGL. Self-Monitoring of Blood Glucose (SMBG) is an approach in which people with diabetes measure their own blood glucose (glycemia) using glycemic monitoring equipment. Based on their readings, they can assess their condition and relate it to their habits and treatments.
Currently, only invasive methods are commercially available, either blood glucose laboratory tests or glucometers. Both methods injure the patient's body, either with a syringe or with a finger prick procedure. However, the invasive methods provide a high degree of accuracy [11-14]. Their disadvantage is the patient's discomfort when blood samples are taken using either an injection or a glucometer with the finger prick procedure [15]. This process can also cause infections because BGL checks need to be repeated throughout the day. In addition, even though the device's price is relatively affordable, the expense for disposable consumables is a financial burden for the user [16].
Non-invasive BGL monitoring methods can address the shortcomings of invasive methods [17, 18]. Research on non-invasive BGL measurement started in 1974 [19]. Non-invasive BGL monitoring is one of the strategies for dealing with the limitations of invasive methods, and it is currently developing rapidly. A system that is painless, easy to operate, and low-cost will improve patient compliance with routine blood glucose monitoring.
Non-invasive methods to monitor or predict BGL have been developed using different techniques. The preferred approaches are based on sensor and optical techniques [20]. Sensor-based measurements include photonic crystal glucose sensing [21], on-chip electrochemical sensing, continuous glucose monitoring, and metabolic heat conformation methods. Optical measurements include mid-infrared spectroscopy, fluorescence, near-infrared spectroscopy, time of flight, Raman spectroscopy, scattering change techniques, and photoacoustic techniques [12, 17, 22]. However, these non-invasive methods are still in the early stage of development. They have not been designed optimally as wearable devices and are not ready for home care.
New methods continue to emerge in this research area, so the information needs to be updated continuously. One of these methods uses photoplethysmography (PPG), as shown in Figure 1.

Figure 1. Blood Glucose Measurement Techniques [17].

Figure 2. PPG Measurement Techniques.
Photoplethysmography is a non-invasive optical technique that measures changes in blood volume based on variations in the intensity of light that passes through or is reflected by human organs [23]. The PPG signal is collected from a pulse oximeter, which is commonly used in clinical practice for monitoring oxygen saturation non-invasively [23]. Previous research has shown that the PPG signal can be explored to assess vital signs such as blood pressure, cardiac output, and respiration, and to identify diabetes [25].

The PPG signal collected from pulse oximetry uses a photodiode with a 700-1500 nm range [33] as an optical sensor to capture light from a 940 nm infrared LED, as shown in Figure 2. The signal generated by the photodiode is a low-frequency signal mixed with noise. This signal is then processed by a signal conditioning circuit consisting of a high-pass filter, an op-amp amplifier, and a low-pass filter.

Many academics are also developing PPG signal prototypes with optical sensor-based data acquisition systems. This is supported by the development of embedded system technology and relatively affordable sensor prices [32]. PPG is an easy and inexpensive optical measurement technique that could be used to monitor BGL in diabetes [24-26].
The PPG signal contains a lot of information about blood, and many aspects of PPG morphology have not yet been explored. PPG morphology shows a different pattern between healthy and diabetic subjects of the same age in three different age ranges [36, 37], as shown in Figure 3. People with diabetes usually have a bell-shaped PPG wave without a secondary peak, whereas healthy subjects have a slightly prolonged shape in the diastolic phase. The difference in the PPG signal curve can be modeled using PPG wave measures such as area, slope, and interval [38]. Variations in the derivative of the PPG signal may be affected by arterial rigidity caused by diabetes [39].

Figure 3. The morphology of PPG in healthy and diabetic subjects (male) in three age groups.

However, there are several potential sources of error in the estimation of BGL based on the PPG method, including:

a. PPG waveforms are very sensitive and easily affected by motion, which causes measurement errors [21-25]. Most of the artifacts are caused by movement relative to the skin [26].
b. The quality of the PPG waveform is easily degraded by poor blood circulation.
c. The characteristics of the PPG waveform vary according to fluctuations in peripheral vascular resistance, vascular wall elasticity, and blood viscosity.
d. PPG waveforms are easily affected; as a result, the relationship between peripheral pulses and BGL will be less than optimal.
Most previous studies have focused on estimating the BGL value. With these methods, medical supervision is still needed [29,30]. Classifying subjects as non-diabetic or diabetic helps patients monitor themselves independently to detect diabetes. Users can instantly know their blood glucose condition, which provides an early warning for potential patients.

At the same time, advances in digital technology, big data, and artificial intelligence have brought changes in various fields, including healthcare. This technology can significantly improve the detection and early treatment of diabetes complications. Artificial intelligence (AI) can be applied through a machine learning (ML) or deep learning (DL) approach, as shown in Table 1. Combining PPG with AI makes it possible to predict BGL.
In recent years, deep learning methods have shown outstanding performance in pattern recognition applications [32]. However, deep learning classification methods have two main problems: classification accuracy and time consumption during training. We attempt to address these limitations and propose a method for the classification of BGL using machine learning classification methods based on PPG. The proposed method is suitable for real-time BGL classification. Our main contributions are as follows:

a. We focus on BGL classification based on the PPG signal. In this study, two BGL classification levels (normal and diabetes) were established. With our proposed method, users can immediately know the condition of their BGL, which can expedite the treatment process.
b. Our proposed method does not need a special process to guarantee the PPG signal's quality, and it does not need a calibration process.
c. Our proposed method uses machine learning instead of deep learning to achieve a faster training time. A common problem of deep learning is that the training stage is too long.
d. The mapping of extracted PPG features using a machine-learning algorithm shows promising results [33].
In this article, we propose a non-invasive classification of blood glucose level for the early detection of diabetes using subspace k-nearest neighbours based on the photoplethysmography signal. Using photoplethysmography data collected by near-infrared sensors, users can monitor their blood glucose classification continuously and in real time.

This paper is organized as follows: Section 2 describes the materials and methods; Section 3 presents the experimental results; Section 4 discusses the results; and finally, Section 5 presents the conclusions of the study.

Table 1. Classification methods comparison.

Year | Number of Subjects | PPG Signal | Invasive Methods | Classifier | Features Extraction | Evaluation Metric | Characteristic | Ref
2009 | 140 | Prototype PPG Signal | Glucose meter | Auto-Regressive Moving Average (ARMA) | Not Mentioned | Specificity = 0.91, Sensitivity = 1 | Classification | [42]
2017 | 1157 | Pulse Oximeter | HbA1c Test | Random Forest | 69 | ROC = 0.686 | Classification | [27]
2017 | 1157 | Pulse Oximeter | HbA1c Test | Gradient Boosting | 69 | ROC = 0.693 | Classification | [27]
2017 | 1157 | Pulse Oximeter | HbA1c Test | Linear Discriminant Analysis | 69 | ROC = 0.551 | Classification | [27]
2019 | 611 | Prototype PPG Signal | Glucose meter | Single Pulse Analysis | 8 | R2 = 0.91 | Classification | [32]
2019 | 611 | Prototype PPG Signal | Glucose meter | Time and Frequency Domain | 8 | R2 = 0.84 | Classification | [32]
2019 | 30 | Prototype PPG Signal | Glucose meter | Convolutional Neural Network | Auto Extraction | MSE error = 0.15 | Regression | [43]
2019 | 50 | Prototype PPG Signal | Glucose meter | Subspace KNN | 67 | Accuracy = 86.2% | Classification | [30]
2019 | 50 | Prototype PPG Signal | Glucose meter | RUS Boosted Trees | 67 | Accuracy = 85.0% | Classification | [30]
2019 | 50 | Prototype PPG Signal | Glucose meter | Bagged Trees | 67 | Accuracy = 86.0% | Classification | [30]
2019 | 50 | Prototype PPG Signal | Glucose meter | Decision Trees | 67 | Accuracy = 80.1% | Classification | [30]
2019 | 136 | Prototype Toe PPG | Not Mentioned | Support Vector Machine | 37 | Accuracy = 97.87% | Classification | [38]
2020 | 459 | Fingertip PPG | HbA1c Test | Logistic Regression | Not Mentioned | Accuracy = 92.3% | Classification | [44]
2020 | 80 | Smartphone PPG Signal | Glucose meter | Gaussian Support Vector Machine (GSVM) | 28 | Accuracy = 81.49% | Classification | [45]
2020 | 80 | Smartphone PPG Signal | Glucose meter | Bagged Trees | Not Mentioned | Accuracy = 74% | Classification | [45]
2020 | 80 | Smartphone PPG Signal | Glucose meter | K-Nearest Neighbor | Not Mentioned | Accuracy = 71% | Classification | [45]
2020 | 200 | Prototype PPG Signal | Glucose meter | Fine Gaussian Support Vector Regression (FGSVR) | 6 | mARD = 7.62, RMSE = 11.20 | Regression | [33]
2020 | 200 | Prototype PPG Signal | Glucose meter | Support Vector Regression Quadratic | 6 | mARD = 21.10, RMSE = 42.90 | Regression | [33]
2020 | 200 | Prototype PPG Signal | Glucose meter | Linear Regression | 6 | mARD = 13.22, RMSE = 23.35 | Regression | [33]
2020 | 200 | Prototype PPG Signal | Glucose meter | Enabled Boosted Trees | 6 | mARD = 9.67, RMSE = 13.00 | Regression | [33]

2. Materials and Methods

2.1. Data Acquisition

We collected the original PPG signals from students of the Indonesian Defense University and patients of the Wocare Diabetes Clinic. The data were collected to obtain information about the basic physiology of each individual: the PPG waveform signal, blood pressure, blood oxygen saturation level, heart rate, and BGL value were recorded at the same time. Each parameter was measured three times on each subject.

Table 2. Diabetes range.

The dataset includes PPG and BGL information from subjects divided according to their BGL values into two classes, namely normal and diabetic. The total duration of the experiment was approximately 15 minutes. Collecting the PPG and BGL signals takes about 5 minutes. Each data segment consists of 280 sampling points. We divided the data into signals and labels. The signals are a cell array containing the collected PPG signals, while the label describes each signal as normal or diabetes based on the BGL value, as shown in Table 2. Waveforms of the PPG signals are shown in Figure 4.
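The split into fixed-length segments and labels described above can be sketched as follows. This is a minimal illustration only: the function name `segment_signal`, the synthetic recording, and the decision to drop an incomplete trailing segment are assumptions, since the paper does not describe how recordings were cut into 280-point segments.

```python
from typing import List, Sequence, Tuple

SEGMENT_LENGTH = 280  # sampling points per segment, as stated in the text


def segment_signal(samples: Sequence[float], label: str,
                   segment_length: int = SEGMENT_LENGTH
                   ) -> Tuple[List[List[float]], List[str]]:
    """Split one recording into non-overlapping fixed-length segments.

    Trailing samples that do not fill a whole segment are dropped
    (an assumption; the paper does not say how remainders are handled).
    """
    n_segments = len(samples) // segment_length
    signals = [list(samples[i * segment_length:(i + 1) * segment_length])
               for i in range(n_segments)]
    labels = [label] * n_segments  # every segment inherits the subject's label
    return signals, labels


# Hypothetical 600-sample recording from a subject labelled "normal".
recording = [float(i % 50) for i in range(600)]
signals, labels = segment_signal(recording, "normal")
print(len(signals), len(signals[0]), labels)
```

Keeping signals and labels as two parallel lists mirrors the signal/label grouping described in the text.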
Figure 4. Two different BGL classifications. Each segment consists of 280 sampling points.

After all the data were collected, we divided them into two sets: a training set to train the classifier and a test set to test its accuracy. To prevent bias, the dataset was augmented by duplicating signal data from each classification level until each group had the same number of records (150 normal subjects and 150 diabetic subjects). In this study, a confusion matrix was used to visualize the performance of the classifiers on data sets where the true values are known. To comprehensively evaluate the test model, various evaluation indices were used, including accuracy (Ac), recall (Re), specificity (Sp), precision (Pr), sensitivity (Se), and F1 score.

To anticipate low-quality PPG signals, signals must be selected based on their waveform. Each PPG signal segment is evaluated against a classification threshold as an unfit, acceptable, or excellent PPG waveform to determine whether it should be stored, as described in Figure 5 [35]. This measure was developed to reduce the number of PPG segments with high noise and motion artifacts.

Figure 5. PPG waveform characteristics.
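The class balancing by duplication described above can be illustrated with a short sketch. It assumes random duplication with replacement until every class reaches the size of the largest class; the function name `balance_by_duplication`, the seed, and the toy data are hypothetical, and the paper does not state how the duplicated records were chosen.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def balance_by_duplication(signals: Sequence[Sequence[float]],
                           labels: Sequence[str],
                           seed: int = 0
                           ) -> Tuple[List[List[float]], List[str]]:
    """Duplicate randomly chosen records of smaller classes until all
    classes have as many records as the largest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_class: Dict[str, List[List[float]]] = {}
    for signal, label in zip(signals, labels):
        by_class.setdefault(label, []).append(list(signal))
    target = max(len(items) for items in by_class.values())
    out_signals: List[List[float]] = []
    out_labels: List[str] = []
    for label, items in by_class.items():
        duplicates = [rng.choice(items) for _ in range(target - len(items))]
        out_signals.extend(items + duplicates)
        out_labels.extend([label] * target)
    return out_signals, out_labels


# Toy data: 5 "normal" and 3 "diabetes" records of 4 samples each.
signals = [[0.1 * i] * 4 for i in range(8)]
labels = ["normal"] * 5 + ["diabetes"] * 3
balanced_signals, balanced_labels = balance_by_duplication(signals, labels)
print(Counter(balanced_labels))
```

When duplication is used, it is usually safer to balance the training set only, so that copies of the same record do not appear in both the training and the test set.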
2.2. Design of Hardware

[Figure 6: block diagram; legible block labels: Feature Extraction, Classification Result]

Figure 6. Design of the BGL monitoring system.
The following circuit shows the ON/OFF control scheme for the infrared light source on the HRM-2511E. The control signal must be pulled high to turn on the IR LED. The output of the photodetector (VSENSOR) contains the PPG signal, which enters the two-stage filter and amplifier circuit for further processing.

Figure 7. HRM-2511E sensor circuit.
The PPG signal from the photodetector is weak and noisy, so it requires an amplifier and filter circuit to boost and clean the signal. In the Phase I instrumentation, the signal is first passed through a passive RC high-pass filter (HPF) to block the DC component of the PPG signal. The HPF cut-off frequency is 0.5 Hz, set by the values of R (68 kΩ) and C (4.7 µF). The output of the HPF goes to an active low-pass filter (LPF) based on an op amp. The op amp operates in non-inverting mode, with its gain and cut-off frequency set to 48 and 3.4 Hz, respectively.
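The stated HPF cut-off can be checked against the component values with the standard first-order RC relation f_c = 1 / (2πRC). The short sketch below only verifies this arithmetic; the helper name `rc_cutoff_hz` is illustrative and not part of the described hardware.

```python
import math


def rc_cutoff_hz(r_ohm: float, c_farad: float) -> float:
    """Cut-off frequency of a first-order RC filter: 1 / (2 * pi * R * C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


# Values given in the text: R = 68 kOhm, C = 4.7 uF.
fc = rc_cutoff_hz(68e3, 4.7e-6)
print(f"HPF cut-off: {fc:.2f} Hz")
```

The result, about 0.50 Hz, agrees with the 0.5 Hz cut-off quoted for the passive HPF.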
To achieve full swing of the PPG signal at the output, the negative input of the op amp is tied to a 2.0 V reference voltage (V-ref), which is generated using a Zener diode. At the output, a potentiometer (P1) acts as a manual gain control. The output of the active LPF goes to the Phase II instrumentation circuit, which is essentially a replica of the Phase I circuit. Note that the amplitude of the signal going to the second stage is controlled by P1. The op amp used in this project is the MCP6004 from Microchip, which is a quad op amp device.

Figure 8. Stage 1, filtering and amplification.
The second stage also consists of similar HPF and LPF circuits. The filtered signal is fed to a third op amp, which is configured as a non-inverting buffer with unity gain. The output of the buffer provides the required analog PPG signal. Potentiometer P1 can be used to control the amplitude of the PPG signal that appears at the output of the buffer stage. The fourth op amp in the MCP6004 is used as a voltage comparator. The analog PPG signal is fed to the positive input, and the negative input is tied to the reference voltage (VR). The magnitude of VR can be adjusted between 0 and Vcc via potentiometer P2 (shown below). Whenever the PPG pulse waveform exceeds the VR threshold, the comparator output goes high. Thus, this arrangement provides a digital pulse output that is synchronized with the heartbeat. The pulse width is also determined by VR.
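The comparator behaviour described above can be mimicked in software to show how a threshold turns the analog PPG waveform into a pulse train. The sketch below is a simulation under stated assumptions, not the hardware itself: the PPG is replaced by a synthetic 1.2 Hz sine centred on 2.0 V, and the sampling rate, amplitudes, and threshold values are hypothetical.

```python
import math
from typing import List, Sequence


def comparator(samples: Sequence[float], v_ref: float) -> List[int]:
    """Output 1 while the input exceeds the reference voltage, else 0."""
    return [1 if v > v_ref else 0 for v in samples]


def count_rising_edges(bits: Sequence[int]) -> int:
    """Count 0 -> 1 transitions, i.e. one per detected pulse."""
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev == 0 and cur == 1)


# Synthetic stand-in for the analog PPG: 1.2 Hz sine around 2.0 V (assumed).
fs = 100          # samples per second (assumed)
duration_s = 10   # seconds of signal
ppg = [2.0 + 0.5 * math.sin(2 * math.pi * 1.2 * n / fs)
       for n in range(fs * duration_s)]

pulses = comparator(ppg, v_ref=2.3)
beats = count_rising_edges(pulses)
bpm = beats * 60 / duration_s
print(beats, bpm)

# A higher threshold gives narrower pulses, as noted in the text.
narrow = sum(comparator(ppg, v_ref=2.4))
wide = sum(comparator(ppg, v_ref=2.1))
print(narrow < wide)
```

Counting rising edges rather than high samples keeps the beat count independent of the pulse width, which changes with the threshold.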
Figure 9. Stage 2, instrumentation circuit.
The Easy Pulse sensor makes it possible to measure the pulse from the fingertip using the PPG transmission mode. Easy Pulse Version 1.1 uses the HRM-2511E sensor, which fits comfortably on the fingertip. Inside the sensor is an IR LED that illuminates the fingertip from one side. A photodetector placed on the opposite side and facing the IR LED detects the light transmitted through the finger. Small variations in the intensity of the transmitted light are synchronized with changes in blood volume and therefore with the pumping action of the heart. The on-board electronics generate the PPG signal and amplify it so that it can be read by the microcontroller.

Figure 10. Digital pulse output circuit.
3. Results

We experimented with MATLAB (version R2019a) to classify BP based on PPG signals. In this study, the dataset was divided into a training set and a testing set. We collected data from the PPG-BP figshare database [34]; the data are available as MATLAB files in the Supplementary Materials. The PPG features were then analyzed. Each PPG signal was extracted into 2100 sample points. Feature extraction was carried out point by point so that the physiological data contained in the PPG signals can be explored optimally. This also makes the number of sample points used the largest and most detailed compared to previous studies.
In this study, before deciding which model to use, a comparative analysis was conducted with other models (linear discriminant, decision tree, discriminant analysis, support vector machine, K-nearest neighbor, bagged trees, and a deep learning RNN (long short-term memory)) on the same dataset. The dataset was divided into a training set (870 subjects) and a testing set (30 subjects). We compared the testing performance based on the accuracy value. The results indicate that the KNN algorithm achieved better testing performance than the other classification methods, as shown in Table 3. In this study, a confusion matrix was used to visualize classifier performance for a dataset where the true values are known. The axis labels are the class labels hypertension (HT), normal (NT), and prehypertension (PHT). The output class represents the label assigned to the signal by the network.
Table 3. Comparison of testing performance.

Classifier | Accuracy
Support vector machine | 73.6%
Decision tree | 80.0%
Discriminant analysis | 80.0%
Bagged trees | 80.0%
Long short-term memory | 80.0%
K-nearest neighbor | 86.7%
The target class represents the ground-truth label of the signal. The green cells represent true positive (TP) or true negative (TN) signals. The confusion matrix from the testing process of each model is shown in Figure 8. Based on the results of tests between models, KNN achieved the best results; therefore, this study used KNN as a classifier.

In the proposed KNN model, there are two main parameters that need to be analyzed: the number of neighbors (K) and the accuracy value. To evaluate these parameters, a series of contrast experiments with different training parameter sets was conducted. We ran the contrast experiments with different numbers of neighbors to obtain the best accuracy value, while keeping the distance metric (Euclidean), distance weight (equal), and standardized data (true) unchanged; the detailed parameter set is shown in Table 2. The results indicate that the KNN algorithm with K = 1 achieved better training accuracy than the other values of K. The scatter plot can help to investigate which features to include or exclude. We can visualize training data and misclassified points on the scatter plot. The scatter plots of a training set with different numbers of neighbors are shown in Figure 9.
The dataset was divided into a training set (779 subjects) and a testing set (121 subjects). The confusion matrix from the testing process of the KNN algorithm is shown in Figure 10. The confusion matrix of the testing process shows that 74.30% of the ground-truth hypertension signals are correctly classified as hypertension (HT), 100% of the ground-truth normal signals are correctly classified as normal (NT), and 82.50% of the ground-truth prehypertension signals are correctly classified as prehypertension (PHT). The six evaluation indices were computed from the true positive (TP), false positive (FP), true negative (TN), and false negative (FN) quantities. Table 3 shows the classification performance of our proposed method (KNN algorithm). The F1 scores of these three classification trials were 100%, 100%, and 90.80%, respectively.
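The evaluation indices named in Section 2.1 follow the standard definitions in terms of TP, FP, TN, and FN, which can be sketched as below. The function name `binary_metrics` and the counts in the usage example are hypothetical and are not results from this study; note that recall and sensitivity are the same quantity.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard evaluation indices from the confusion-matrix counts."""
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0  # guard against empty classes

    recall = ratio(tp, tp + fn)            # recall equals sensitivity
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "recall": recall,
        "sensitivity": recall,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a three-class problem such as HT, NT, and PHT, the same function can be applied one class at a time by treating that class as positive and the other two as negative.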
We performed a comparative study between our method and the results of previous studies [31,33]. To compare BP classifications based on a PPG signal, three classification experiments were carried out: NT (46 subjects) versus PHT (41 subjects), NT (46 subjects) versus HT (34 subjects), and HT (34 subjects) versus PHT + NT (87 subjects). Table 4 presents a performance comparison with earlier studies.

4. Discussion

Our proposed method uses KNN (machine learning) instead of deep learning to achieve faster training times. KNN does not use the training data to perform any generalization. In KNN, there is no explicit training phase, or it is very minimal, which means that the training phase is fast. The lack of generalization means that KNN keeps all the training data; more precisely, all the training data are needed during the testing phase. We chose KNN over other machine learning classifiers because KNN does not require assumptions about the data. This makes it suitable for nonlinear data such as PPG signals. KNN stores the training dataset and learns from it only at the time of making real-time predictions. This makes the KNN algorithm much faster to train than other machine learning methods that require training, for example support vector machine (SVM) and linear regression. Since the KNN algorithm requires no training before making predictions, new data can be added seamlessly, which will not impact the accuracy of the algorithm. A disadvantage of KNN is that feature scaling (standardization and normalization) is needed before applying the algorithm to any dataset. Each PPG signal has been extracted into 2100 features. Feature extraction was carried out point by point so that the physiological data contained in the PPG signals can be explored optimally. This also makes the number of features used the largest and most detailed compared to previous studies.
The training error rate and the validation error rate are two parameters we needed to assess with different K values. In this study, we compared several K values and found that K = 1 had the lowest error rate and the highest accuracy value. In Figure 11, the error rate at K = 1 is always zero for the training sample. This is because the closest point to any training data point is itself. Hence, the prediction on the training sample is always accurate with K = 1.
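The zero training error at K = 1 can be reproduced with a few lines of plain Python. The sketch below is a generic Euclidean KNN with equal-weight majority voting on a hypothetical two-feature toy set; it is not the MATLAB model used in this study, and the names `knn_predict` and `training_error` are illustrative.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def knn_predict(train_x: Sequence[Point], train_y: Sequence[str],
                query: Point, k: int) -> str:
    """Majority vote among the k nearest training points (Euclidean)."""
    dists = sorted((math.dist(x, query), i) for i, x in enumerate(train_x))
    votes = Counter(train_y[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]


def training_error(train_x: Sequence[Point], train_y: Sequence[str],
                   k: int) -> float:
    """Fraction of training points misclassified when they are re-predicted."""
    wrong = sum(knn_predict(train_x, train_y, x, k) != y
                for x, y in zip(train_x, train_y))
    return wrong / len(train_x)


# Toy data: the last point carries a "diabetes" label inside the
# "normal" cluster, standing in for a noisy or atypical record.
train_x: List[Point] = [(0, 0), (0, 1), (1, 0),
                        (5, 5), (5, 6), (6, 5),
                        (0.5, 0.5)]
train_y = ["normal"] * 3 + ["diabetes"] * 3 + ["diabetes"]

err_k1 = training_error(train_x, train_y, k=1)
err_k3 = training_error(train_x, train_y, k=3)
print(err_k1, err_k3)
```

With K = 1 every training point is its own nearest neighbor, so the training error is zero by construction; this is why the validation or test error, rather than the training error, is the informative quantity when choosing K.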
We performed a comparative study between our method and the results of previous studies [26,33]. The first study, by Liang et al. [26], used the PTT-middle to represent the pulse arrival time (PAT) feature, as shown in Figure 12. PAT has some limitations, as it cannot classify these three categories of blood pressure levels. Additionally, the combined feature set of the PAT feature and 10 PPG features achieves higher accuracy than other models. The study employed four distinct classifiers: a bagged tree, K-nearest neighbors (KNN), logistic regression, and an AdaBoost tree. The KNN classifier presented the best performance compared with the other models in the first study by Liang et al. [26], as shown in Table 4. The F1 scores of the three classification trials (NT (46 subjects) vs. PHT (41 subjects), NT (46 subjects) vs. HT (34 subjects), and NT + PHT (87 subjects) vs. HT (34 subjects)) were 83.34%, 94.84%, and 88.49%, respectively. Table 4 shows that the F1 scores of our proposed KNN method were higher than those of KNN with the method of Liang et al. [26]. The accurate identification of feature points is very important, especially for methods based on PPG morphology, and the PPG sampling frequency is the key. In our study, the PPG signal was collected at a 1000 Hz sampling frequency, whereas the sampling frequency in the method of Liang et al. [26] is only 125 Hz in the MIMIC database, which could lead to errors in identifying each characteristic point. The number of extracted features greatly affects the classification accuracy. Our study used 2100 PPG feature points, whereas Liang et al. [26] used only 10 PPG features. Our method is also simpler because it uses only one input signal, i.e., PPG, while Liang et al. [26] used two input signals, namely ECG and PPG, as shown in Table 5.
In the second study, Liang et al. [33] used a continuous wavelet transform (scalogram) and CNN deep learning for BP classification, and the training took a very long time. They used a training set containing 2323 images, which took about 350 min for training. In contrast, our proposed method, using a training set of 779 images, required a training time of only about 74.116 s. In this case, because the training set was large, the training process could take several minutes. When a network uses data with a large range of values and a large average, the learning process and convergence of the network can be slow [38]. The F1 scores of the three classification trials in their study (NT (46 subjects) vs. PHT (41 subjects), NT (46 subjects) vs. HT (34 subjects), and NT + PHT (87 subjects) vs. HT (41 subjects)) were 80.52%, 92.55%, and 82.95%, respectively. Table 4 shows that the F1 scores of our proposed method (KNN) were higher than those of the CNN classifier and regression methods, such as the bagged tree, logistic regression, and AdaBoost tree methods. This result indicates that our proposed method achieved higher accuracy than the CNNs, propagation, and regression methods.
5. Conclusions

This section is not mandatory but can be added to the manuscript if the discussion is unusually long or complex.

6. Patents

This section is not mandatory but may be added if there are patents resulting from the work reported in this manuscript.

Supplementary Materials: The following are available online at www.mdpi.com/xxx/s1, Figure S1: title, Table S1: title, Video S1: title.
26 Information 2021, 12, x FOR PEER REVIEW 13 of 14
27



References
[1] K. I. Galaviz et al., "Lifestyle and the Prevention of Type 2 Diabetes: A Status Report," Am J Lifestyle Med, vol. 12, no. 1, pp. 4-20, Jan-Feb 2018, doi: 10.1177/1559827615619159.
[2] D. Castaneda, A. Esparza, M. Ghamari, C. Soltanpur, and H. Nazeran, "A review on wearable photoplethysmography sensors and their potential future applications in health care," Int J Biosens Bioelectron, vol. 4, no. 4, pp. 195-202, 2018, doi: 10.15406/ijbsbe.2018.04.00125.
[3] Y. K. Qawqzeh, "Neural Network-based Diabetic Type II High-Risk Prediction using Photoplethysmogram Waveform Analysis," International Journal of Advanced Computer Science and Applications, vol. 10, no. 12, 2019, doi: 10.14569/ijacsa.2019.0101212.
[4] E. M. Moreno et al., "Type 2 Diabetes Screening Test by Means of a Pulse Oximeter," IEEE Trans Biomed Eng, vol. 64, no. 2, pp. 341-351, Feb 2017, doi: 10.1109/TBME.2016.2554661.
[5] S. Masaoka, A. Lev-Ran, L. R. Hill, G. Vakil, and E. H. G. Hon, "Heart Rate Variability in Diabetes: Relationship to Age and Duration of the Disease," Diabetes Care, vol. 8, no. 1, p. 4, 1985.
[6] N. Nirala, R. Periyasamy, B. K. Singh, and A. Kumar, "Detection of type-2 diabetes using characteristics of toe photoplethysmogram by applying support vector machine," Biocybernetics and Biomedical Engineering, vol. 39, no. 1, pp. 38-51, 2019, doi: 10.1016/j.bbe.2018.09.007.
[7] Q. Yousef, M. B. I. Reaz, and M. A. M. Ali, "The Analysis of PPG Morphology: Investigating the Effects of Aging on Arterial Compliance," Measurement Science Review, vol. 12, no. 6, 2012, doi: 10.2478/v10048-012-0036-3.
