S. Asgari et al. / Computers in Biology and Medicine 60 (2015) 132-142

2.1.3. AF classification

Our goal is to classify each data segment into the AF or non-AF category using 4L extracted features. For this purpose, we employ the support vector machine (SVM) as our classification algorithm. SVM is a non-parametric binary classifier which has shown promising results in various medical diagnostics [32-35]. In a conventional binary classification problem, a data point is viewed as a p-dimensional vector which belongs to one of two possible categories. An SVM classifies these data points by finding the best (p-1)-dimensional hyperplane that separates all data points of one class from those of the other class. The separating hyperplane has the largest margin between the two classes (the margin is the maximal width of the slab parallel to the hyperplane that has no interior data points). SVM then classifies new samples based on which side of the hyperplane they fall on and how far they are from the hyperplane (SVM score) [35]. One of the major advantages of SVM is that, in addition to linear classification, it can efficiently perform non-linear classification [37]. Using a kernel trick, samples are mapped into a higher-dimensional feature space where a linear classifier can separate the two classes with the largest margin among the samples. Mapping the separating hyperplane back to the original space results in a non-linear classifier. For our AF detection, we use a common kernel choice, the Gaussian Radial Basis Function kernel

$$K(\phi_i, \phi_j) = \exp\left(-\frac{\|\phi_i - \phi_j\|^2}{2\sigma^2}\right)$$

with scaling factor $\sigma = 1$. Here $\phi_i$ and $\phi_j$ are the feature vectors of two data segments and $\|\phi_i - \phi_j\|^2$ is the squared Euclidean distance between the two feature vectors. The separating hyperplane for our SVM classifier is obtained using the conventional "Least-Squares" method.

2.2. Patient data

To evaluate the performance of the proposed method, we employ the MIT-BIH Atrial Fibrillation (MIT-BIH AFIB) dataset [38], the most popular and most frequently used publicly available dataset for AF detection. This dataset contains 23 annotated records of ECG signal from atrial fibrillation patients (mostly paroxysmal) with a sampling frequency of 250 Hz and 12-bit resolution over a range of ±10 mV. Each record is about 10 h long and the whole dataset includes slightly less than 234 h of data. The dataset has 605 annotated episodes: 291 atrial fibrillation episodes with an average duration of 115 s, 14 atrial flutter episodes with an average duration of 419 s, 12 episodes of junctional rhythm with an average duration of 27 s, and 288 episodes of all other rhythms with an average duration of 174 s. Fig. 3 presents examples of the ECG signal during four annotated episodes.

2.3. Data analysis and validation protocol

Daubechies 5 is an orthogonal wavelet resembling the ECG waveform in morphology [25]. Hence, we choose the Daubechies 5 wavelet as the mother wavelet to perform wavelet analysis. Given the sampling frequency of 250 Hz and the frequency range of atrial activities (4-9 Hz) [39], an L = 6 level wavelet transform is implemented. A periodogram spectral estimator with a Hamming window is used to obtain the power spectrum of each wavelet coefficient.

The original beat-to-beat annotations of the data are converted to a T-second resolution (T is the duration of each data segment) by using a minimum percentage parameter P [18,20]: a data segment is classified as true AF only if the percentage of annotated AF beats in that data segment is more than P. A two-fold stratified cross-validation on the feature vectors (extracted from the dataset) is employed to train and test the classifier.
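The conversion from beat-level annotations to segment-level ground truth can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `segment_labels`, the representation of beats as `(time, is_af)` pairs, and the choice to label a segment containing no beats as non-AF are all assumptions made for the example.

```python
def segment_labels(beats, segment_len, min_af_percent):
    """Label each T-second segment as AF (True) or non-AF (False).

    beats: list of (time_in_seconds, is_af) pairs, one per annotated beat.
    segment_len: segment duration T in seconds.
    min_af_percent: threshold P; a segment counts as AF only if MORE than
        P percent of its beats are annotated as AF.
    """
    if not beats:
        return []

    # Number of segments needed to cover the last annotated beat.
    n_segments = int(max(t for t, _ in beats) // segment_len) + 1
    total = [0] * n_segments   # beats per segment
    af = [0] * n_segments      # AF beats per segment

    for t, is_af in beats:
        k = int(t // segment_len)
        total[k] += 1
        if is_af:
            af[k] += 1

    labels = []
    for n_all, n_af in zip(total, af):
        # Assumption: a segment without any beats is treated as non-AF.
        labels.append(n_all > 0 and 100.0 * n_af / n_all > min_af_percent)
    return labels
```

With one beat per second, T = 10 s and P = 50%, a segment holding 6 AF beats out of 10 is labelled AF, while a segment holding exactly 5 out of 10 is not, because the comparison is strict.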
At the testing phase of each fold, SVM scores for the testing samples (distance of the testing samples to the separating hyperplane) are obtained and compared with a pre-set threshold $\varepsilon$ to classify the corresponding testing samples. The classification results (from each fold) are then compared to the ground truth (converted annotation), and the numbers of True Positive (TP), False Negative (FN), True Negative (TN) and False Positive (FP) cases are calculated. Finally, the results from the two folds are combined to calculate the True Positive Rate (TPR) as $\mathrm{TPR} = \mathrm{TP}/(\mathrm{TP}+\mathrm{FN})$, the False Positive Rate (FPR) as $\mathrm{FPR} = \mathrm{FP}/(\mathrm{FP}+\mathrm{TN})$, and the Accuracy (ACC) as $\mathrm{ACC} = (\mathrm{TP}+\mathrm{TN})/(\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN})$.

In order to obtain the optimal values of the parameters of the proposed method (T, P and $\varepsilon$), an exhaustive standard Receiver Operating Characteristic (ROC) curve analysis is implemented as follows. Parameter T (length of the data segment in seconds) is varied over a reasonable range of 10-120 s with incremental steps of 5 s. Parameter P (percentage threshold for annotation of AF data segments) is varied from 0% to 100% with incremental steps of 10%. Then, for each specific value of T and P, an ROC curve is derived by varying the score threshold $\varepsilon$ from -15 to 15 with incremental steps of 0.01. The ROC with the highest Area Under the Curve (AUC) is used to obtain the optimal values of parameters

Fig. 3. Examples of ECG signal during four annotated episodes: (A) atrial fibrillation; (B) atrial flutter; (C) junctional rhythm; and (D) other rhythm.
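The rate definitions and the threshold sweep can be illustrated with a short sketch. It is not the authors' implementation: the function names `rates` and `roc_auc`, the convention that a score at or above the threshold is called AF, the trapezoidal rule for the AUC, and the explicit addition of the (0, 0) and (1, 1) end points are assumptions made for the example.

```python
def rates(scores, labels, threshold):
    """Return (TPR, FPR, ACC), calling a sample AF when score >= threshold."""
    tp = fn = tn = fp = 0
    for score, is_af in zip(scores, labels):
        predicted_af = score >= threshold
        if is_af and predicted_af:
            tp += 1
        elif is_af:
            fn += 1
        elif predicted_af:
            fp += 1
        else:
            tn += 1
    total = tp + fn + tn + fp
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    acc = (tp + tn) / total if total else 0.0
    return tpr, fpr, acc


def roc_auc(scores, labels, lo=-15.0, hi=15.0, step=0.01):
    """Sweep the score threshold from lo to hi and integrate the ROC curve."""
    n_steps = int(round((hi - lo) / step))
    points = set()
    for i in range(n_steps + 1):
        tpr, fpr, _ = rates(scores, labels, lo + i * step)
        points.add((fpr, tpr))
    # Assumption: close the curve at its trivial end points.
    points.update([(0.0, 0.0), (1.0, 1.0)])
    pts = sorted(points)
    # Trapezoidal rule over consecutive (FPR, TPR) points.
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
```

In a full parameter search, `roc_auc` would be evaluated once per (T, P) pair, using the pooled scores and labels from both cross-validation folds, and the pair with the largest AUC would be retained.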
