
Prediction of Fetal Health Status Using Machine Learning

Naidile S Saragodu[1], Shreedhara N Hegde[2]

[1] Student, Master of Computer Applications, Dayananda Sagar Academy of Technology and Management, Karnataka, India, naidilessaragodu99@gmail.com

[2] Assistant Professor, Master of Computer Applications, Dayananda Sagar Academy of Technology and Management, Karnataka, India, shreedhara-mca@dsatm.edu.in

ABSTRACT
The goal of this promising area of study is to enhance prenatal care and lower fetal morbidity and mortality by
utilising machine learning to anticipate fetal disease. In this study, we present a machine learning-based strategy
for predicting fetal diseases from clinical data. First, we gathered a sizable collection of clinical information from
expectant mothers with various fetal disorders. Using clinical guidelines, we pre-processed the data and retrieved
pertinent features. We integrated a range of machine learning algorithms, including logistic regression, support
vector machines, decision trees, and random forests, to train and test our model. We evaluated the performance of
our model using a number of factors, such as accuracy, sensitivity, specificity, and area under the receiver
operating characteristic curve (AUC-ROC). The results of this study demonstrate how machine learning algorithms
can accurately forecast fetal health status. The developed models achieve good accuracy and AUC-ROC scores
in distinguishing between healthy and at-risk fetuses. The interpretability study identifies key clinical
characteristics that have a significant impact on the prediction, providing medical practitioners with useful
information when making decisions about prenatal care. Through the provision of more unbiased and precise
assessments of fetal health status, machine learning techniques incorporated into prenatal care have the potential
to transform the industry. By providing accurate and early projections, this technology can assist healthcare
professionals in identifying high-risk pregnancies and carrying out the necessary procedures, improving maternal
and fetal outcomes. Future research should concentrate on verifying and improving predictive models on larger
and more varied datasets to ensure real-world applicability and reliability.

Keywords - SVM, LR, Random Forest Classification, Fetal Health, Machine Learning.

1. INTRODUCTION
Fetal health is the state of the unborn child. In the United States, the fetal death rate in 2019 was roughly 5.6, or
about 6 per 1,000 infants, according to the available sources. The rate of fetal mortality can be reduced by being
aware of fetal health issues and taking the appropriate safety measures. The health of the fetus can be inferred
from continuous measurements such as acceleration, baseline value, histogram mean, and kicks, although there are
several ways to do so depending on how these values are calculated. Our technique therefore applies machine
learning to the available dataset in order to forecast fetal health with a focus on accuracy. Here, a "dataset" is
a collection of records whose values give the data context and are essential in determining the health of the
infant. For this prediction, the Kaggle dataset is used. The dataset contains about 21 features and 21 records,
but we take only 5 features into account in our forecast; selecting them is part of preprocessing. Given these
five features as input, the Random Forest Classifier algorithm returns one of three results: Normal, Suspect, or
Pathologic.

Pre-processing, also known as data pre-processing, is the process of choosing the data and excluding all
extraneous aspects. Features are chosen based on how much they contribute to health prediction. We additionally
clean our dataset during this process by deleting records with null feature values in order to obtain accurate
and unambiguous data. The Random Forest Classifier is a machine learning approach for classification and
prediction. It employs a large number of decision trees, each built from a different subset of the dataset, and
combines their outputs in order to improve predictive accuracy. The Decision Tree Classifier and the Random
Forest Classifier differ primarily in two ways. The first concerns the training data: the Decision Tree
Classifier is trained on a single training set, whereas the Random Forest Classifier trains each of its trees on
a different subset drawn from the training data. The second is the number of trees used for prediction.
The Random Forest Classifier uses more than one tree for prediction, one per training subset, whereas the
Decision Tree Classifier uses only one. Because there is only one tree in the Decision Tree Classifier, there is
only one output. The Random Forest Classifier produces one output per tree, and the majority of these outputs
determines the final result. Using the majority lets us predict the result more precisely and with less
overfitting. In data science, overfitting describes the situation where a fitted model matches its training data
so closely that it generalizes poorly to new data. MongoDB is the database we employed for
the implementation.
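The contrast between one tree's single output and the forest's majority decision can be sketched with the standard library alone. The per-tree votes and record ids below are illustrative stand-ins, not outputs of the paper's model, and the helper names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement: one tree's training subset."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)
rows = list(range(10))                 # stand-in record ids (assumption)
subset = bootstrap_sample(rows, rng)   # some ids repeat, some are left out

# Illustrative votes from five trees for one test record.
votes = ["Normal", "Suspect", "Normal", "Pathologic", "Normal"]
final_label = majority_vote(votes)
print(final_label)
```

A single decision tree would report only its own vote; the forest reports the label that most trees agree on.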

We are keeping the user's details, login information, and report values in the database for prediction. Because the
forecast data will be saved and available to the user, they can view or download the previous outcome. This work
suggests finding the anomalies using machine learning-based categorization algorithms for early identification
and prediction of fetus abnormalities. Due to advancements in image classification methods and machine learning
techniques, it is now substantially easier to identify abnormalities during the first trimester of pregnancy.

2. RELATED WORKS

In this study, we used clinical records of the fetus to forecast the fetal status and determine whether the
condition is normal, suspect, or pathologic using a machine learning approach. Our survey of the literature
identified a number of methods for assessing the health of the fetus.

The research provides a detailed literature analysis that covers cutting-edge methodologies for predicting
fetal health, including data gathering methods, machine learning techniques, and pertinent feature extraction
techniques. Numerous studies have investigated the application of conventional machine learning techniques for
predicting fetal health, including logistic regression, decision trees, and random forests. The suggested enhanced
random forest algorithm, however, is a fresh strategy that outperforms traditional approaches[1].

By examining cardiotocography data, the authors of this study aim to categorise the fetal status into groups
like normal, suspicious, or abnormal. Cardiotocography is a method used to track the fetal heart rate and uterine
contractions throughout pregnancy, giving important insights into the health of the fetus[2].

The authors suggest a hybrid deep learning system for predicting fetal health status using cardiotocographic data.
This approach combines the AlexNet convolutional neural network (CNN) architecture with a support vector
machine (SVM). Cardiotocography (CTG) is a method for evaluating the health of the fetus during pregnancy by
keeping track of the fetal heart rate and uterine contractions[5].
With the use of CTG recordings, which are assessments of the fetal heart rate and uterine contractions, the author
of this study attempts to forecast the fetal status. The goal is to create a precise model that can forecast the fetal
status and give obstetricians useful data to track the fetus' health[8].

3. PROPOSED WORK
The proposed work creates a report by examining the findings of tests performed on expectant women to identify
and assess the condition of the fetus. The result is provided through a website built on a random forest
classifier prediction algorithm, based on datasets and the outcomes of numerous prior tests. Using the findings of
tests conducted on pregnant women as input, the system determines the health, abnormality, or pathology of the
fetus. The objectives are:
• To obtain and clean the dataset.
• To experiment with the dataset to increase accuracy.
• To design an intuitive user interface.
• To use the provided input to create a PDF of the results.
• To send the PDF to the user's email address.
Fig1. Learning model flow Fig2. Proposed learning model flow

Figure 1 depicts the learning model's flowchart, while Figure 2 depicts the suggested model and system
architecture. Using random subsets of the training dataset, the random forest classifier creates a series of
decision trees. To choose the final class for the test objects, it integrates the votes from several decision trees.

Table1. Class distribution of CTGs Fig3. Graphical representation of fetal dataset

1. Data Preparation
The fetal health dataset is used in a variety of studies. Due to the dearth of fetal data, the model could
only be built by combining the information gathered from reliable sources, and the working dataset was produced
from this fetal data. This information is used to forecast the health of the fetus using a number of fetal
features. As in other machine learning and data visualization applications, a filtering method is used to choose
a portion of the initial data.

2. Data Pre-processing
A dataset can be created using different paradigms. Understanding the fundamental features of an item allows
one to differentiate it using a collection of features. The first step in improving dataset quality is to
eliminate the empty values that degrade accuracy. The dataset's structure must also be taken into account,
since it motivates the feature engineering process that follows.
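Removing records with empty values, as described above, can be illustrated with a minimal sketch. The column names and values are hypothetical, chosen only to resemble CTG measurements; the paper does not list the exact dataset headers.

```python
# Hypothetical column names (assumption, not the dataset's real headers).
FEATURES = ["baseline_value", "accelerations", "histogram_mean"]


def drop_incomplete(records, columns):
    """Keep only records that have a non-null value for every listed column."""
    return [
        {col: rec[col] for col in columns}
        for rec in records
        if all(rec.get(col) is not None for col in columns)
    ]


raw = [
    {"baseline_value": 120, "accelerations": 0.003, "histogram_mean": 137},
    {"baseline_value": None, "accelerations": 0.001, "histogram_mean": 140},
    {"baseline_value": 134, "accelerations": 0.002},  # histogram_mean missing
]
clean = drop_incomplete(raw, FEATURES)
print(len(raw), "->", len(clean), "records")
```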

3. Feature Engineering
The process relies heavily on feature engineering. First, visualization clarifies the many health aspects
and assists in determining their necessity. The model is then applied to each attribute, both on its own and in
relation to the other attributes.
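One possible way to judge which attributes matter is to rank them by a fitted forest's impurity-based importance. This is a sketch under assumptions: it uses scikit-learn and synthetic data from `make_classification`, and the paper does not state that its five features were chosen this way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic three-class data standing in for the fetal dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Sort feature indices from most to least important.
ranking = sorted(
    enumerate(forest.feature_importances_), key=lambda pair: pair[1], reverse=True
)
top_five = [index for index, _ in ranking[:5]]
print(top_five)
```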

4. Prediction Algorithm
This step builds on steps 1 to 3 above to assess the fetal health state and establish whether it is questionable,
abnormal, or normal in circumstances when maternal clinical data was in question. Classification algorithms are
used for this purpose. To achieve high prediction accuracy, a sufficiently large dataset, with over 100,000
observations of suspect, normal, and pathological data, can be acquired. The accuracy of each classifier is
measured, and an algorithm with high prediction accuracy is used. Table 1 displays the development environment
for the ML-based conventional classifier. The following variables are used to predict the fetal health state.

4. MODEL
1. Random Forest
It is one of the most potent algorithms since it can generate precise models from small training samples. Adding
more trees to a random forest generally makes the forecast more accurate, up to a point of diminishing returns,
since each tree can identify specific subtleties of the data that the other trees could have missed. Additionally,
the random forest's architecture lessens the chance of overfitting by randomly choosing subsets of data to train
each tree on, ensuring that no single input disproportionately influences any given tree. Random forest is one of
the most popular and frequently applied algorithms among data scientists. It builds decision trees from a variety
of samples and combines them using their average in the case of regression and their majority vote in the case of
classification.

Fig4. Random forest classifier example Fig5. Random forest classifier working

A random forest classifier's key attributes are stability, parallelization, train-test split, resilience to the
curse of dimensionality, and diversity among its trees. Tuning hyperparameters such as the number of estimators,
the maximum number of features, and the minimum number of samples per leaf can boost its predictive power.
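A minimal sketch of tuning these three hyperparameters follows, assuming scikit-learn, where they are called `n_estimators`, `max_features`, and `min_samples_leaf`. The grid values and the synthetic data are illustrative assumptions, not the settings used in this work.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic three-class data standing in for the fetal dataset (assumption).
X, y = make_classification(
    n_samples=200, n_features=5, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Illustrative grid; real ranges would depend on the dataset.
param_grid = {
    "n_estimators": [25, 50],
    "max_features": ["sqrt", None],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="accuracy"
)
search.fit(X, y)
best = search.best_params_
print(best, round(search.best_score_, 3))
```

Each combination is scored with 3-fold cross-validation, so the chosen setting is not judged on the data it was trained on.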

2. Logistic Regression
Logistic regression is a widely used algorithm due to its simplicity, interpretability, and efficiency. It can
handle both numerical and categorical input features, although appropriate feature scaling and handling of
categorical variables may be necessary. Additionally, logistic regression can be extended to handle multiclass
classification problems using techniques like one-vs-rest or multinomial logistic regression.
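The combination of feature scaling and multiclass logistic regression can be sketched as a small pipeline. This assumes scikit-learn and synthetic three-class data; it is an illustration of the technique rather than the configuration used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with three classes, mirroring Normal/Suspect/Pathologic.
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Scale the numerical features, then fit a multiclass logistic regression.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)

# One probability per class for each of the first four records.
proba = model.predict_proba(X[:4])
print(proba.round(3))
```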

Fig6. Logistic Regression


3. Support Vector Machine
With a kernel, an SVM can find a nonlinear decision boundary that classifies data points that cannot be
separated linearly. An SVM divides the data into two classes by using the support vectors to establish a
decision boundary. It is therefore well suited to classifying complicated data sets, such
as PD voice data.
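The effect of a nonlinear kernel can be illustrated on data that no straight line separates. The sketch assumes scikit-learn and uses the synthetic `make_circles` dataset (two concentric rings), which is unrelated to the fetal data.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: not linearly separable by construction.
X, y = make_circles(n_samples=400, noise=0.05, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Same data, two kernels: a linear boundary versus an RBF boundary.
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)
rbf_acc = SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test)
print(f"linear: {linear_acc:.2f}  rbf: {rbf_acc:.2f}")
```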

4. Decision Tree
The decision tree algorithm is a popular machine learning method for classification and regression applications.
It creates a model in the shape of a tree, where each internal node stands for an attribute or feature, each
branch for a decision based on that attribute, and each leaf node for a class label or a predicted value.
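The node, branch, and leaf structure can be made visible by printing a small fitted tree. This sketch assumes scikit-learn; the data is synthetic and the feature names `f0` to `f3` are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(
    n_samples=150, n_features=4, n_informative=2, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# A shallow tree keeps the printed structure short.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Internal nodes print as feature tests, leaves as "class: <label>".
rules = export_text(tree, feature_names=["f0", "f1", "f2", "f3"])
print(rules)
```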

5. IMPLEMENTATION

Data gathering: Compile a dataset with pertinent details regarding the condition of the fetus. This information may
include particulars like cardiotocogram (CTG) readings, markers of maternal health, and other pertinent details.
Make sure the dataset is representative and diversified.

Data Preprocessing: Prepare the obtained data for machine learning model training by cleaning and preprocessing
it. In this step, missing values are handled, numerical characteristics are normalized, categorical variables are
encoded, and any other required data transformations are carried out.
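The three transformations named in this step (handling missing values, normalizing numerical features, encoding categorical variables) can be sketched with the standard library. Mean imputation and z-score scaling are assumed choices here; the sample values and helper names are hypothetical.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed ones."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale to zero mean and unit variance (assumes non-constant input)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def one_hot(labels):
    """Encode each label as a 0/1 vector over the sorted set of categories."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


baseline = [120.0, None, 140.0, 130.0]          # illustrative values
filled = impute_mean(baseline)
scaled = standardize(filled)
encoded = one_hot(["low", "high", "low"])       # hypothetical categorical field
print(filled, encoded)
```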

Feature Selection / Engineering: Analyze the dataset and choose or create features that are most likely to be useful
for predicting the health state of the fetus. This stage can need subject-matter expertise and consulting with medical
professionals.

Model Selection: Select a suitable machine learning method to forecast the health state of the fetus. For
classification tasks, methods such as decision trees, random forests, support vector machines (SVM), and neural
networks are often used. When choosing the model, take into account the problem's particular requirements as
well as the features of the dataset.

Model Training: Split the preprocessed data into training and validation sets for the model. Use the selected
algorithm to train the selected model on the training set. Hyperparameters should be adjusted as necessary to
improve model performance. To prevent overfitting, use methods such as regularization and cross-validation.
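The split and cross-validation described in this step might look as follows, assuming scikit-learn. The 80/20 split, the five folds, and the forest settings are illustrative assumptions rather than the values used in this work.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Hold out 20% for validation, keeping class proportions equal in both parts.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=50, min_samples_leaf=3, random_state=0)

# 5-fold cross-validation on the training part only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

model.fit(X_train, y_train)
val_accuracy = model.score(X_val, y_val)
print(cv_scores.round(3), round(val_accuracy, 3))
```

A large gap between the cross-validation scores and the training accuracy would be a sign of overfitting.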

Model Evaluation: Evaluate the performance of the trained model using the validation set. For classification tasks,
standard assessment metrics include accuracy, precision, recall, F1 score, and area under the receiver operating
characteristic curve (AUC-ROC).
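Most of these metrics follow directly from counts of correct and incorrect predictions. The sketch below computes them for one class treated as positive, using six made-up labels; AUC-ROC is omitted because it needs predicted scores rather than hard labels.

```python
def class_metrics(y_true, y_pred, positive):
    """Precision, recall, specificity and F1 for one class versus the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        # Accuracy is the overall share of exactly matching labels.
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Illustrative labels, not results from the paper.
y_true = ["Normal", "Normal", "Suspect", "Pathologic", "Pathologic", "Normal"]
y_pred = ["Normal", "Suspect", "Suspect", "Pathologic", "Normal", "Normal"]
metrics = class_metrics(y_true, y_pred, positive="Pathologic")
print(metrics)
```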

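Model Deployment: When you are pleased with the model's performance, you can use it to produce forecasts
based on new, unseen data. This may entail developing an application or incorporating the model into a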
process that already exists.

Ongoing Monitoring & Maintenance: Continue to keep an eye on how the deployed model is performing and
make any necessary updates as soon as new data become available. Keep up with the most recent research and
techniques in the field of fetal health status prediction to ensure that the model remains accurate and helpful.

6. RESULTS & DISCUSSION

When solving prediction problems involving unstructured data, our data were modelled and the resulting accuracy
is 98%. This good accuracy is shown in Fig. 7.

Fig7. Accuracy & loss graph for Random forest

Fetal heart rate, acceleration, movement, uterine contractions, and general fetal health are tracked after applying
ML algorithms to various datasets. Figure 8 depicts the fetal heart rate at rest; less than 110 bpm is deemed
"abnormal" or "slow heart rate," and the range of 110 to 160 bpm is regarded as "normal."

Fig8. Baseline Value
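The baseline thresholds stated for Figure 8 can be written as a simple rule. The text does not say how values above 160 bpm are labelled, so the sketch returns "unspecified" for them; that label and the function name are assumptions.

```python
def classify_baseline(bpm):
    """Label a resting fetal heart rate using the thresholds given in the text."""
    if bpm < 110:
        return "abnormal"      # described in the text as "slow heart rate"
    if bpm <= 160:
        return "normal"
    return "unspecified"       # above 160 bpm: not covered by the text


for rate in (100, 110, 160, 170):
    print(rate, classify_baseline(rate))
```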

The fetal heart rate acceleration per second is depicted in Figure 9. Short-term increases in heart rate that last
at least 15 seconds or are at least 15 beats per minute on the x-axis are regarded as "normal," whereas other
instances are deemed "abnormal." This can be found by looking at the oxygen supply relative to mm Hg on the
y-axis. Figure 10 shows how many fetal movements occur every second, with 10 or fewer considered "normal." The
movement is deemed "abnormal" if it is not recorded at the designated time. The "normal" condition for the uterine
contraction in Fig. 11 is 0.005 milliseconds. It is 0.001 milliseconds for the "suspect" condition and 0.003
milliseconds for "pathological" or "abnormal" conditions. The overall impact of the factors on the fetal health
status is expressed as a class value, with 1 denoting a "normal" heart rate and anything above being seen as
"suspicious" or "pathological."
Fig9. Fetal heart rate acceleration Fig10. Fetal movements

Fig11. Uterine Contraction

Based on the input of the data, our approach predicts that 92–98% of the parameters will be normal and that
2–4% would be ambiguous or aberrant, as illustrated in Fig. 12.

Fig12. Fetal health with results Fig13.Picture of proof how we select 5 parameters
1. Abnormal short-term variability: The fetal heart's short-term variability is the range of 3 to 5 beats per
minute (BPM) that it experiences from one second to the next.

2. Mean value of short-term variability: The average of the absolute differences between subsequent intervals
of the ECG.

3. Period with abnormal long-term variability: The fetal heart rate "acceleration" is a considerable long-term
variation. Accelerations are frequently triggered by fetal movement, last 10 to 20 seconds or more, and are at
least 15 beats per minute faster than the baseline.

4. Histogram mean: The mean of the histogram summarizes the fetal heart rate across various time periods.
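The second parameter in the list above, the mean value of short-term variability, can be computed directly from its definition. The interval values below are made up for illustration and the function name is hypothetical.

```python
def short_term_variability_mean(intervals):
    """Average absolute difference between each interval and the next one."""
    diffs = [abs(b - a) for a, b in zip(intervals, intervals[1:])]
    if not diffs:
        raise ValueError("at least two intervals are required")
    return sum(diffs) / len(diffs)


# Illustrative beat-to-beat intervals in milliseconds (assumption).
stv = short_term_variability_mean([430, 434, 431, 436])
print(stv)
```

For these four intervals the successive differences are 4, 3, and 5, so the mean is 4.0.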

Fig14. Acceleration value

Fig15. Short term variability Fig16. Long term variability

7. CONCLUSION

In summary, predicting fetal health status using machine learning has shown considerable potential. Researchers
and medical experts have been able to create precise models for determining the health of unborn children by
utilizing a variety of algorithms and vast datasets. These models can aid in risk assessment and early intervention,
which will eventually improve outcomes for both the mother and the fetus.
Machine learning algorithms can efficiently analyse complicated patterns and create predictions about the fetal
health status by using characteristics like maternal health records, fetal heart rate patterns, and other pertinent
information. These models may be able to spot irregularities and difficulties, such as intrauterine growth
restriction, fetal distress, and congenital defects, which can have a big impact on the fetus's health.
More individualized and pro-active methods of treating pregnancies may result from applying machine learning
to prenatal care. Informed decisions on interventions, such as medical treatments, closer monitoring, or even early
delivery if necessary, may be made by healthcare professionals with the correct and timely knowledge about the
fetal health status.
While machine learning has shown potential in this field, it is crucial to remember that technology has its limits.
The calibre and variety of the training data have a significant impact on the models' accuracy. When applying
machine learning in healthcare, it's also important to carefully evaluate ethical issues, privacy issues, and the
requirement for interpretability in the decision-making process.
The continuing development and validation of machine learning models for predicting fetal health status depend
on research and collaboration between medical experts and data scientists. We may improve the generalizability
and reliability of these models, assuring their usefulness in realistic healthcare settings, by adding new data
sources, refining current algorithms, and performing rigorous clinical trials.
By enabling early identification and intervention for fetal health issues, machine learning has the potential to
completely transform prenatal care. Continued research in this area might lead to better results for expectant
mothers and their unborn children, making pregnancy a safer and healthier time for everyone.

8. REFERENCES

[1] K. Vimala and D. Usha, “An efficient classification of congenital fetal heart disorder using improved random
forest algorithm,” Int. J. Eng. Trends Technol., vol. 68, no. 12, pp. 182–186, 2020, doi: 10.14445/22315381/IJETT-
V68I12P229.

[2] M. Z. Arif, R. Ahmed, U. H. Sadia, M. S. I. Tultul, and R. Chakma, “Decision Tree Method Using for Fetal
State Classification from Cardiotography Data,” J. Adv. Eng. Comput., vol. 4, no. 1, p. 64, 2020, doi:
10.25073/jaec.202041.273.

[3] M. Ramla, S. Sangeetha, and S. Nickolas, “Fetal Health State Monitoring Using Decision Tree Classifier from
Cardiotocography Measurements,” Proc. 2nd Int. Conf. Intell. Comput. Control Syst. ICICCS 2018, no. Iciccs,
pp. 1799–1803, 2019, doi: 10.1109/ICCONS.2018.8663047.

[4] P. Bhowmik, P. C. Bhowmik, U. A. M. Ehsan Ali, and M. Sohrawordi, “Cardiotocography Data Analysis to
Predict Fetal Health Risks with Tree-Based Ensemble Learning,” Int. J. Inf. Technol. Comput. Sci., vol. 13, no. 5,
pp. 30–40, 2021, doi: 10.5815/ijitcs.2021.05.03.

[5] N. Muhammad Hussain, A. U. Rehman, M. T. Ben Othman, J. Zafar, H. Zafar, and H. Hamam, “Accessing
Artificial Intelligence for Fetus Health Status Using Hybrid Deep Learning Algorithm (AlexNet-SVM) on
Cardiotocographic Data,” Sensors, vol. 22, no. 14, 2022, doi: 10.3390/s22145103.

[6] A. Subasi, B. Kadasa, and E. Kremic, “Classification of the cardiotocogram data for anticipation of fetal risks
using bagging ensemble classifier,” Procedia Comput. Sci., vol. 168, no. 2019, pp. 34–39, 2020, doi:
10.1016/j.procs.2020.02.248.

[7] K. Agrawal and H. Mohan, “Cardiotocography Analysis for Fetal State Classification Using Machine Learning
Algorithms,” 2019 Int. Conf. Comput. Commun. Informatics, ICCCI 2019, 2019, doi:
10.1109/ICCCI.2019.8822218.

[8] M. S. Iraji, “Prediction of fetal state from the cardiotocogram recordings using neural network models,” Artif.
Intell. Med., vol. 96, no. August 2018, pp. 33–44, 2019, doi: 10.1016/j.artmed.2019.03.005.

[9] M. M. Imran Molla, J. J. Jui, B. S. Bari, M. Rashid, and M. J. Hasan, “Cardiotocogram data classification
using random forest based machine learning algorithm,” Lect. Notes Electr. Eng., vol. 666, no. January, pp. 357–
369, 2021, doi: 10.1007/978-981-15-5281-6_25.
