Abstract— Polycystic Ovary Syndrome (PCOS) is a medical condition affecting the female reproductive system, causing ano/oligo-ovulation, hyperandrogenism, and/or polycystic ovaries. Due to the complexities involved in diagnosing this disorder, it was of utmost importance to find a solution to assist physicians with this process. Therefore, in this study, we investigated the possibility of building a model that automates the diagnosis of PCOS using Machine Learning (ML) algorithms and techniques. In this context, a dataset consisting of 39 features, ranging from metabolic and imaging to hormonal and biochemical parameters, for 541 subjects was used. First, we applied pre-processing to the data. Thereafter, a hybrid feature selection approach using filters and wrappers was implemented to reduce the number of features. Different classification algorithms were then trained and evaluated. Based on a thorough analysis, the Support Vector Machine with a linear kernel (Linear SVM) was chosen, as it performed best among the others in terms of precision (93.665%) while also achieving high accuracy (91.6%) and recall (80.6%).

Keywords—PCOS, Machine Learning, Disease Detection, Classification

I. INTRODUCTION

Polycystic ovary syndrome (PCOS) is a complex, heterogeneous endocrine disorder that affects the female reproductive system [1]. PCOS affects 12 to 21% of reproductive-age women, of whom 70% remain undiagnosed [2]. This syndrome often causes menstrual abnormalities, hyperandrogenism, fertility problems, and metabolic sequelae. It is characterized by features that include irregular menstrual cycles, acne, and hirsutism [3]. Additionally, PCOS typically involves hormonal imbalances, insulin resistance, and metabolic abnormalities, which significantly increase the risk of infertility, type 2 diabetes, and cardiovascular disease [4].

Diagnosing PCOS can be a challenging process given the ambiguity of its symptoms. According to the Endocrine Society, clinicians are advised to diagnose PCOS using the Rotterdam criteria, which require the presence of at least two of the following three medical conditions: hyperandrogenism, ovulatory dysfunction, and polycystic ovaries. Alongside these symptoms, measuring the LH and FSH levels can be beneficial in establishing an LH/FSH ratio; a ratio greater than 2 usually indicates PCOS [5].

Despite the presence of diagnostic criteria, there are various clinical uncertainties and complexities when it comes to the diagnosis of PCOS. To begin with, little research has been done on the diagnosis of PCOS given the ambiguity of its symptoms and their broad spectrum of severity. For example, it has been reported that PCOS has multiple heterogeneous presentations that vary according to age and race, and can be indicative of multiple other comorbidities. Although physicians report being familiar with the most common comorbidities, such as obesity, diabetes, insulin resistance, and depression, they are less likely to associate PCOS with conditions such as sleep apnea, fatty liver, endometrial cancer, gestational diabetes, and anxiety. Therefore, due to its diverse presentation, diagnosing PCOS may require the evaluation of different physicians. However, the lack of coordination between doctors might prevent the comprehensive assessment needed to diagnose PCOS [6].

Furthermore, PCOS usually goes underdiagnosed by many healthcare providers. According to a community-based study, 18% of women had PCOS, of whom 70% remained undiagnosed. Moreover, current literature suggests that the underdiagnosis of PCOS is exacerbated, especially in adolescents. Primary indicators of PCOS, such as anovulation, cyst development, the presence of multiple follicles, hormonal fluctuations, and acne, are present during puberty as well. Hence, physicians tend to be less concerned about these symptoms at that age [6]. Concerns were also raised regarding the overdiagnosis of PCOS, as some clinicians expand the diagnostic criteria of PCOS to include polycystic ovaries [6].

Due to all of the above complexities, inconsistent approaches have been taken to diagnose women with PCOS. In fact, it has been reported that women were dissatisfied, frustrated, and confused by their diagnostic experience, and had to wait an average of 2 years to receive the correct diagnosis [7]. A study conducted on patient satisfaction with PCOS diagnosis revealed that only 35.2% were satisfied with their experience, based on the time it took to diagnose and the number of physicians seen before attaining a final diagnosis [6]. In fact, because many patients have to dedicate a lot of time to undergoing several tests and visiting several specialists to be taken seriously and have their symptoms checked, many have resorted to self-diagnosing and managing PCOS on their own [6].
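The two-of-three Rotterdam rule and the LH/FSH ratio check described above can be expressed as a simple screening sketch. This is purely illustrative, not the model developed in this paper; the function names and boolean inputs are hypothetical:

```python
# Illustrative sketch of the Rotterdam two-of-three rule and the
# LH/FSH ratio check; names are hypothetical, not this paper's model.

def rotterdam_positive(hyperandrogenism: bool,
                       ovulatory_dysfunction: bool,
                       polycystic_ovaries: bool) -> bool:
    """At least two of the three Rotterdam conditions must be present."""
    return sum([hyperandrogenism, ovulatory_dysfunction,
                polycystic_ovaries]) >= 2

def lh_fsh_suggestive(lh: float, fsh: float, threshold: float = 2.0) -> bool:
    """An LH/FSH ratio greater than 2 usually indicates PCOS [5]."""
    return fsh > 0 and lh / fsh > threshold
```

In practice the Rotterdam conditions themselves require clinical and imaging assessment; the sketch only captures the counting logic.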
978-1-6654-3501-7/21/$31.00 ©2021 IEEE
2021 Sixth International Conference on Advances in Biomedical Engineering (ICABME)
the larger numeric features do not dominate those with smaller values [12].

In this work, we used Z-score normalization, as it preserves the statistical characteristics of the data. Feature records are transformed using Equation (1) to have a mean equal to zero and a variance equal to one [12]. The equation of Z-score normalization is given below, where the records x_{i,j} of feature i are transformed into records x′_{i,j}:

x′_{i,j} = (x_{i,j} − μ_i) / σ_i    (1)

where μ_i and σ_i are the mean and standard deviation of feature i, respectively.

C. Feature Selection

Feature selection (FS) is a crucial step to improve the classifier's performance and reduce computation time. FS consists of selecting the most effective or prominent features from the entire set of features. When irrelevant or redundant features are removed, the performance of the classifier is boosted and good classification results are achieved [13].

In this study, we utilize a hybrid feature selection approach to select the optimal feature subset. A typical hybrid approach first applies filters to remove the least significant features, and then applies wrappers to fine-tune and further reduce the dimensionality of the dataset. Filters are models that depend on the training dataset rather than on the classifier or predictor. Since filters do not depend on the classification algorithm, they are considered simple and fast, and they scale easily to high-dimensional datasets [13]. On the other hand, wrappers select a subset of features based on the performance of the classifier. Unlike the filter approach, wrappers consider the correlation between features and the way they interact with the Machine Learning algorithm [13]. In our case, filters are applied first to reduce the number of features, which makes the subsequent application of wrappers less computationally expensive [14].

1) Filters
To perform feature selection on the continuous variables in our dataset, we applied the Analysis of Variance (ANOVA) test. ANOVA is a statistical test that compares the means of several samples. The ANOVA test reveals which features have a strong influence on the outcome or on other parameters [15].

After performing the ANOVA test on each feature with respect to our output (whether the patient suffers from PCOS or not), the p-value was computed. If the p-value was less than 0.05, the feature was considered significant, and vice versa. The results obtained are shown in Table I.

TABLE I. ANOVA TEST P-VALUES

Feature                  | P-value
BMI                      | 0.000012
Cycle Length             | 0.000011
Marriage Status          | 0.003796
FSH                      | 0.005678
LH                       | 0.01959
LH/FSH                   | 0.00046
AMH                      | 1.416478E-09
Left Follicle No.        | 1.900231E-51
Right Avg. Follicle Size | 0.05398
Endometrium Thickness    | 0.02598

After performing the ANOVA test on each feature, we consulted an OB-GYN to make sure that the results obtained were aligned with medical field standards. The physician recommended keeping some features that the ANOVA test labeled as insignificant, since they are crucial for the diagnosis of PCOS (Table II). Therefore, in order to strike a balance between the features recommended by the physician and those found significant by the ANOVA test, we decided to filter out the features that were considered insignificant by both the doctor and the statistical test, namely blood group, RR, FSH/LH, Vitamin D3, systolic blood pressure, and diastolic blood pressure.

TABLE II. FEATURE SIGNIFICANCE BASED ON ANOVA TEST AND DOCTOR RECOMMENDATION

Feature         | Statistically Significant | Doctor Recommendation
Cycle Length    | Yes | Yes
Marriage Status | Yes | Yes (for delayed fertility)
No. of abortions| No  | Yes (PCOS causes miscarriages)
FSH             | Yes | Yes
LH              | Yes | Yes
LH/FSH          | Yes | Yes
FSH/LH          | No  | No
TSH             | No  | Yes (for differential diagnosis)
AMH             | Yes | Yes
PRL             | No  | Yes (for differential diagnosis)
Vitamin D3      | No  | No
PRG             | No  | Yes (gives an idea of ovulation)
RBS             | No  | Yes (gives an idea of the metabolism)

2) Sequential Forward Floating Selection Wrapper
The Sequential Forward Floating Selection (SFFS) algorithm is a wrapper that starts with an empty set of features and sequentially adds features that improve the performance of the classifier. SFFS adjusts the trade-off between forward and backward steps dynamically, applying after each forward step a number of backward steps, as shown in Figure 2. This is considered 'self-controlled backtracking', where backward steps are taken as long as the resulting feature subsets yield better classifier performance compared to the previously evaluated ones at that level. Consequently, no backward steps are taken at all if the result at that specific level does not yield better classifier performance [16].

Figure 2. Sequential Forward Floating Selection Algorithm

The SFFS wrapper was applied on the following classifiers:

1) Logistic Regression (LR)
2) Decision Tree (DT)
3) Naïve Bayes (NB)
4) Linear Support Vector Machine (LSVM)
5) Polynomial Support Vector Machine (PSVM)
6) Radial Basis Function Support Vector Machine (RBF SVM)
7) K-Nearest Neighbors (KNN)
8) AdaBoost (AD)
9) Linear Discriminant Classifier (LD)
10) Quadratic Discriminant Classifier (QD)
11) Random Forest (RF)

In order to decide which evaluation metric to use for the wrapper, we contacted an OB-GYN to determine what clinicians usually look for in medical applications. After a thorough discussion, it was clear that it is important to reduce the number of false positives (FP) so that patients are not put on a course of treatment that would do more harm than good. Thus, precision was chosen as the studied score, since high precision, as shown in Equation (2), means that a low number of FP was achieved.

Precision = TP / (TP + FP)    (2)

where TP and FP are the numbers of true positives and false positives, respectively [17]. An example of the results obtained after implementing the SFFS algorithm can be seen in Figure 3.

After analyzing the results, it was concluded that the best-performing classifiers were the KNN, Linear SVM, and Polynomial SVM. However, the results of the t-test for these three classifiers were inconclusive, and the null hypothesis could not be accepted or rejected.

Therefore, in order to narrow down which classifier performed best, the following steps were taken:

• For each model, the subset of features that yielded the highest precision was taken.
• For each classifier, its corresponding precision-based selected subset was reintroduced into it.
• The model was evaluated in terms of accuracy and recall, as shown in Table IV.
• The t-test was applied to the accuracy and the recall; the p-value was computed for each pair of classifiers (Table V).

TABLE IV. EVALUATION METRICS AFTER INTRODUCING THE BEST-PRECISION FEATURE VECTORS INTO THE SELECTED CLASSIFIERS

Metric    | Linear SVM | Polynomial SVM | KNN
Precision | 0.936645   | 1              | 0.990909
Recall    | 0.8066     | 0.295221       | 0.573897
Accuracy  | 0.916      | 0.7697         | 0.8067
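The precision of Equation (2), together with the recall and accuracy reported in Table IV, can be computed from raw prediction counts. The sketch below is a minimal illustration from first principles; the function names are our own:

```python
# Minimal sketch of the evaluation metrics used above, computed from
# raw counts. tp/fp/fn/tn are true/false positives/negatives.

def precision(tp: int, fp: int) -> float:
    """Equation (2): high precision means few false positives."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Fraction of actual positive cases that were detected."""
    return tp / (tp + fn)

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For example, a classifier producing 80 true positives and 5 false positives has precision 80/85 ≈ 0.94; the trade-off between the three metrics is exactly what distinguishes the classifiers in Table IV.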
medical features or tested their results on a low number of subjects using the hold-out method (Table VI). In our case, on the other hand, we were able to achieve high results on a large number of patients while taking reliable features into consideration.

TABLE VI. COMPARISON BETWEEN OUR PROPOSED MODEL AND OTHER RESEARCH WORK

Work            | Features                                             | No. of Subjects | Evaluation Method | Accuracy % | Precision %
Mehrotra et al. | Clinical & metabolic                                 | 200             | Hold-out          | 91.04      | 82.5
Vikas et al.    | Lifestyle & food intake                              | 119             | Hold-out          | 97.65      | 95
Denny et al.    | Clinical, metabolic, imaging, hormonal & biochemical | 541             | Hold-out          | 89         | 95.83
Proposed method | Clinical, metabolic, imaging, hormonal & biochemical | 541             | 10-fold CV        | 91.6       | 93.66

V. CONCLUSIONS & FUTURE PERSPECTIVES

In this paper, we have investigated the use of classification algorithms to identify patients with Polycystic Ovary Syndrome (PCOS). We studied several machine learning algorithms on a large dataset using a cross-validation technique, statistical tests, and physician opinion. Our objective was to maximize the precision of the model, based on our discussions with physicians. In order to enhance the performance of our model, we used Sequential Forward Floating Selection (SFFS) to select the best features. The best algorithm was found to be the Linear Support Vector Machine with 24 features. Both the precision and accuracy of this algorithm were above 90%, whereas its recall was around 80%.

Having established the importance of automating the process of PCOS diagnosis, and having reached promising results in terms of precision, it is clearly crucial to move forward with this research and find the right approach to enhance the model's performance. Our analysis of the results showed that the classifiers had relatively high precision but did not perform as well when it came to recall. Therefore, there was a need to find a trade-off between precision, accuracy, and recall. To that end, we propose proceeding with the research as follows:

• Relying on accuracy as the evaluation metric of the proposed model
• Applying the SFFS wrapper to the above classifiers to improve the accuracy metric
• Computing the precision and recall for each classifier when given its own optimal set of features (according to accuracy)
• Choosing the classifier with the optimal performance with respect to the three metrics through a method of elimination
• Building our own database with a larger number of subjects

REFERENCES

[1] S. Palomba, Infertility in Women with Polycystic Ovary Syndrome, Springer, 2018.
[2] A. Denny, A. Raj, A. Ashok, C. Ram and R. George, "i-HOPE: Detection And Prediction System For Polycystic Ovary Syndrome (PCOS) Using Machine Learning Techniques," in IEEE Region 10 International Conference TENCON, 2019.
[3] H. Vassalou, M. Sotiraki and L. Michala, "PCOS diagnosis in adolescents: the timeline of a controversy in a systematic review," J Pediatr Endocrinol Metab, vol. 32, no. 6, pp. 549-559, 2019.
[4] C. Dennett and J. Simon, "The Role of Polycystic Ovary Syndrome in Reproductive and Metabolic Health: Overview and Approaches for Treatment," Diabetes Spectrum, vol. 28, no. 2, pp. 116-120, 2015.
[5] T. Williams, R. Mortada and S. Porter, "Diagnosis and Treatment of Polycystic Ovary Syndrome," Am Fam Physician, vol. 94, no. 2, pp. 106-113, 2016.
[6] R. Madhavan, "Prevalence of PCOS diagnoses among women with menstrual irregularity in a diverse, multiethnic cohort," 2015.
[7] T. Copp, E. Cvejic, K. McCaffery, J. Hersch, J. Doust, B. Mol, A. Dokras, G. Mishra and J. Jansen, "Impact of a diagnosis of polycystic ovary syndrome on diet, physical activity and contraceptive use in young women: findings from the Australian Longitudinal Study of Women's Health," Human Reproduction, vol. 35, no. 2, pp. 394-403, 2020.
[8] P. Mehrotra, J. Chatterjee, C. Chakraborty, G. Biswanath and S. Ghoshdastidar, "Automated screening of Polycystic Ovary Syndrome using machine learning techniques," in 2011 Annual IEEE India Conference, 2011.
[9] B. Vikas, B. Anuhya, M. Chilla and S. Sarangi, "A Critical Study of Polycystic Ovarian Syndrome (PCOS) Classification Techniques," International Journal of Clinical and Experimental Medicine, vol. 21, no. 4, pp. 1-7, 2018.
[10] P. Kottarathil, Polycystic Ovary Syndrome (PCOS) Dataset, Kaggle, 2020.
[11] A. O'Malley, K. Draper, R. Gourevitch, D. Cross and S. Scholle, "Electronic health records and support for primary care teamwork," J Am Med Inform Assoc, vol. 22, no. 2, pp. 426-434, 2015.
[12] D. Singh and B. Singh, "Investigating the impact of data normalization on classification performance," Applied Soft Computing, vol. 97, 2019.
[13] S. Solorio-Fernández, J. Martínez-Trinidad, J. Carrasco-Ochoa and Y.-Q. Zhang, "Hybrid feature selection method for biomedical datasets," in IEEE International Conference on Computational Intelligence in Bioinformatics and Computational Biology, 2012.
[14] B. Remeseiro and V. Bolon-Canedo, "A review of feature selection methods in medical applications," Comput Biol Med, vol. 112, pp. 1-9, 2019.
[15] E. Ostertagová and O. Ostertag, "Methodology and Application of One-way ANOVA," American Journal of Mechanical Engineering, vol. 1, no. 7, pp. 256-261, 2013.
[16] Y. Wah, N. Ibrahim, H. Abdul Hamid, S. Abdul-Rahman and S. Fong, "Feature selection methods: Case of filter and wrapper approaches for maximising classification accuracy," Pertanika Journal of Science and Technology, vol. 26, no. 1, pp. 329-340, 2018.
[17] M. Marino, Y. Li, M. Rueschman, J. Winkelman, J. Ellenbogen, J. Solet, H. Dulin, L. Berkman and O. Buxton, "Measuring sleep: accuracy, sensitivity, and specificity of wrist actigraphy compared to polysomnography," Sleep, vol. 36, no. 11, pp. 1747-1755, 2013.
[18] D. Anguita, A. Ghio, S. Ridella and D. Sterpi, "K-Fold Cross Validation for Error Rate Estimate in Support Vector Machines," in The 2009 International Conference on Data Mining, Las Vegas, USA, 2009.