Original Article
a Department of Pediatrics, National Cheng Kung University Hospital, College of Medicine, National Cheng-Kung University, Tainan, Taiwan
b Department of Pediatrics, E-Da Hospital, College of Medicine, I-Shou University, Kaohsiung, Taiwan
Received 28 June 2021; received in revised form 1 September 2021; accepted 24 September 2021
KEYWORDS Machine learning; Length of stay; Retrospective study; Very-low-birth-weight infants

Background/Purpose: The in-hospital length of stay (LOS) among very-low-birth-weight (VLBW, BW < 1500 g) infants is an index for care quality and affects medical resource allocation. We aimed to analyze the LOS among VLBW infants in Taiwan, and to develop and compare the performance of different LOS prediction models using machine learning (ML) techniques.
Methods: This retrospective study illustrated LOS data from VLBW infants born between 2016 and 2018 registered in the Taiwan Neonatal Network. Among infants discharged alive, continuous variables (LOS or postmenstrual age, PMA) and categorical variables (late and non-late discharge group) were used as outcome variables to build prediction models. We used 21 early neonatal variables and six algorithms. The performance was compared using the coefficient of determination (R²) for continuous variables and area under the curve (AUC) for categorical variables.
Results: A total of 3519 VLBW infants were included to illustrate the profile of LOS. We found 59% of mortalities occurred within the first 7 days after birth. The median LOS among surviving and deceased infants was 62 days and 5 days, respectively. For the ML prediction models, 2940 infants were enrolled. Prediction of LOS or PMA had R² values less than 0.6. Among the prediction models for prolonged LOS, the logistic regression (AUC: 0.724) and random forest (AUC: 0.712) approaches had better performance.
* Corresponding author. Department of Pediatrics, National Cheng Kung University Hospital, College of Medicine, National Cheng-Kung
University, No.138, Sheng Li Road Tainan, Taiwan. Fax: +886 6 2753083.
E-mail address: ped1@mail.ncku.edu.tw (Y.-J. Lin).
https://doi.org/10.1016/j.jfma.2021.09.018
0929-6646/Copyright © 2021, Formosan Medical Association. Published by Elsevier Taiwan LLC. This is an open access article under the CC
BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
W.-T. Lin, T.-Y. Wu, Y.-J. Chen et al.
Conclusion: We provide a benchmark of LOS among VLBW infants in each gestational age group in Taiwan. ML techniques can improve the accuracy of prediction models for prolonged LOS among VLBW infants.
Copyright © 2021, Formosan Medical Association. Published by Elsevier Taiwan LLC. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
The survival rates and outcome of very-low-birth-weight (VLBW, BW < 1500 g) infants have improved with advances in neonatal resuscitation and management after birth.1–3 As the survival rate improved, medical resources to care for these VLBW infants increased. In a population-based study in California, VLBW infants accounted for 0.9% of cases but 35.7% of hospital costs.4 In-hospital length of stay (LOS) is one of the factors used to determine the cost and an index of quality of care.5 There are very few studies on LOS for VLBW infants in Taiwan, so in this study, we first illustrated the profile of LOS among VLBW infants in Taiwan.

Meanwhile, accurate predictions of LOS in early life would be helpful for resource planning and family counseling.6 It would also be beneficial for both parents and physicians if an accurate prediction of LOS could be made early in the infant’s life. On the other hand, “overstay in hospital” indicates unsuitable utilization of medical resources. It was defined as patients staying longer than 30 days in the hospital based on the Taiwan National Health Insurance definition. Although this rule is not appropriate in VLBW infants, it is important to identify which patients may have prolonged LOS to improve quality of care and develop efficient medical resource planning. In the previous study, the discharge day of preterm infants was estimated to be around their estimated date of confinement (EDC).7

Previous studies regarding predicting LOS in preterm infants were based on conventional statistical techniques.7–10 These conventional statistical techniques relied on the hypothesis to determine potential risk factors.11 In 2010, Hintz et al. concluded in their study that, “prediction of early or late discharge is poor if only perinatal factors are considered.8”

Machine learning (ML) is a novel analytic tool that uses computers to learn from labeled data rather than a preprogrammed process. It can be used to analyze a large number of variables and describe relationships between the variables and outcomes using different ML algorithms in complex, nonlinear ways.12,13 The application of ML to building LOS prediction models in different medical fields has led to promising results.14,15 However, there are few reports using ML algorithms to predict LOS in VLBW infants. The second purpose of this study, therefore, is to develop LOS prediction models for VLBW infants built using different ML algorithms and to compare their performance.

Patients and data collection

The Taiwan Neonatal Network (TNN) was initiated by the Taiwan Society of Neonatology in 2016. The purpose of this network is to record the clinical information of infants whose gestational age was less than 30 weeks or whose birth body weight was less than 1500 g in order to improve quality of care. During the study period, there were 28 hospitals participating in the network, including 17 medical centers and 11 community hospitals in Taiwan. We used anonymous data from preterm infants born between 2016 and 2018 enrolled in the TNN.
This study has been approved by the Institutional Review Board of National Cheng Kung University Hospital (B-ER-109-090).

Data exclusion

We excluded infants based on the following criteria:

1. Birth body weight more than 1500 g.
2. Unknown gestational age or gestational age less than 22 weeks.
3. Admission day after 7 days of birth.
4. Severe congenital anomaly.
5. Still in hospital after reaching 1 year of age.
6. Transferred to another hospital.

Finally, patients with missing data and deceased infants were also excluded for the purpose of establishing the LOS prediction model using an ML technique.

Profile of LOS

We illustrated the profile of LOS using mortality and gestational age. In this study, LOS was defined as the number of days post-birth to the infant’s discharge day (alive or dead).

Variables for ML

For infants discharged alive, 21 variables during the antenatal, perinatal, and early neonatal periods were used to build the LOS prediction model.
Antenatal variables included gestational age, gender, birth bodyweight, small for gestational age, antenatal
Journal of the Formosan Medical Association 121 (2022) 1141–1148
Random forest consists of a multitude of decision trees to avoid the problem of overfitting of one decision tree.12,13,19,20

Model evaluation and validation

To use all available data for training our model, instead of splitting the data into training and testing groups, 10-fold cross-validation was used. In the 10-fold cross-validation, the data was randomly partitioned into 10 subgroups. Then, one subgroup was used as the testing group, and the other 9 subgroups were used as the training groups to build the prediction model. The process was repeated 10 times using different subgroups as testing and training groups, and the average performance was recorded. Finally, the 10 different models were then averaged to obtain the reported result.

Other statistical analyses

Statistical analyses of patient characteristics were performed using a chi-squared test. Variables with p-values below 0.05 were considered statistically significant.
The Weka Experiment Environment was used to analyze the results and determine if one algorithm was statistically better than the others. Statistical significance refers to the result of a pairwise comparison of the algorithms using a standard t-test or the corrected resamples t-test.21 We used the coefficient of determination (R²) to compare the predictive capabilities of ML with continuous target variables. The area under the curve (AUC) of the receiver operating characteristics (ROC) curve was used as the main target for the ML performance with the categorical target variables.8,22

Results

Study group

Data from 3791 infants born between 2016 and 2018 in TNN were screened for eligibility. After exclusion, there were 3519 infants enrolled (Fig. 1).

Figure 1 Flow chart illustrating patient selection. GA, gestational age; LOS, length of stay; TNN, Taiwan Neonatal Network.

For all 3519 patients, the mortality rate during the hospital stay was 11.2%. There were 1825 (51.8%) male infants. The greatest gestational age was 37 weeks. To ensure privacy, the actual birth bodyweight of each infant was not recorded; instead, the birth body weight was recorded as 100 g categories, such as 1000–1100 g. Thus, we were unable to show the actual mean birth bodyweight of the infants. The median LOS of all patients was 61 days. The summary of the LOS profile categorized by mortality and gestational age is shown in Table 1.
We plotted the cumulative cases against LOS and demonstrated the result in graphic form,23 as shown in Fig. 2. On day 10, there were no infants discharged, and most mortalities occurred before that period (light-gray area). By day 200, almost all infants had been discharged (dark-grey area). Notably, the median LOS for infants who died was less than 7 days, especially for those infants with gestational ages less than 25 weeks, which accounted for two-thirds of the deceased infants. There were 235 infants who died (59.3% of all mortalities) within 7 days and 327 infants who died (82.5% of all mortality) within 30 days. The number of infants who died on each day in the first 30 days is shown in Fig. 3.

Comparison of EDC and LOS in surviving infants

We compared the discharge day of the surviving infants with their EDC. For infants with the gestational ages ranging from 22 to 24 weeks, the median LOS was about 2–3 weeks after EDC. As their gestational age increased, the infants were discharged close to or even in advance of their EDC. Infants with gestational ages ranging from 25 to 27 weeks were discharged around their EDC. Infants with a gestational age over 28 weeks were discharged 2–3 weeks before their EDC.
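As a worked illustration of the EDC comparison described above, the following sketch converts a gestational age at birth and an LOS into the number of weeks before or after the EDC. It assumes the usual convention that the EDC falls at 40 completed weeks of gestation; the function names and the example values are hypothetical and are not taken from the TNN data.

```python
# Illustrative sketch only: names are hypothetical and the example
# values are made up, not taken from the study data.

TERM_WEEKS = 40  # assumption: EDC corresponds to 40 completed weeks of gestation


def days_from_birth_to_edc(gestational_age_weeks):
    """Days between birth and the estimated date of confinement (EDC)."""
    return (TERM_WEEKS - gestational_age_weeks) * 7


def discharge_offset_weeks(gestational_age_weeks, los_days):
    """Weeks between discharge and EDC; positive means discharged after EDC."""
    return (los_days - days_from_birth_to_edc(gestational_age_weeks)) / 7


if __name__ == "__main__":
    # Born at 24 weeks, discharged on day 126: 2 weeks after EDC.
    print(discharge_offset_weeks(24, 126))  # 2.0
    # Born at 29 weeks, discharged on day 63: 2 weeks before EDC.
    print(discharge_offset_weeks(29, 63))   # -2.0
```

A positive value corresponds to discharge after the EDC and a negative value to discharge before it, which matches the direction of the comparison reported for the different gestational age groups.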
Figure 2 Stacked cumulative cases of death (light-gray area), in hospital (black area), and discharge (dark-grey area). LOS, length of stay.

Predicting hospital discharge as non-late or late discharge

The characteristics of the infants in each group are shown in Table 3. Except for gestational age, which was adjusted in each group, among 20 variables, there were 12 variables with significant differences: gender, birth bodyweight, small for gestational age, maternal hypertension or gestational hypertension, cesarean section, APGAR score 1-min and 5-min, oxygen at DR, intubation at DR, chest compression at DR, early sepsis, and surfactant use.
The performance of each ML algorithm is shown in Table 4. Compared with the other models, the models built using the logistic regression (AUC = 0.724) and using the random forest (AUC = 0.712) had statistically significantly higher AUC values.

Discussion

In this study, the profile of LOS grouped by mortality and gestational age among VLBW infants in Taiwan was first illustrated. According to the results, most mortalities (59.3% of all mortality) occurred within the first 7 days of life. Ingram et al. reported that if parents received more information related to their baby’s likely discharge date, it improved their perceptions of their baby’s clinical condition.6 Since mortality is an important determinant of LOS, and there are high mortality rates (11.2%) among these patients, it is important to give this information at an appropriate time.

In clinical practice, we can tell the parents that the baby’s risk of mortality will decrease if the baby is stable after 7 days of age, as well as the estimated LOS based on their gestational age. As shown in Table 1, infants with gestational ages between 22 and 24 weeks may be discharged 2–3 weeks after their EDC. As the week of gestational age increases to 27 weeks, the LOS may be shorter than the time remaining until the EDC. Our results were similar to a large-population-based study in England, where infants born at 25–26 weeks were discharged around their EDC.7

The second part of this study illustrated that ML can help build accurate models for predicting prolonged LOS among VLBW infants in their early life but not for exact LOS and PMA as a continuous variable. For the R² values of less than 0.6, there were also high MAE and RMSE in the predictive models for exact LOS and PMA. For predicting prolonged LOS, models built using a logistic regression and the random forest had better performance in this study, with AUC values of 0.724 and 0.712, respectively. This result was better than the results of a previous report by Hintz et al., with AUC values ranging from 0.56 to 0.69 in their study.8

Also, one of the strengths of ML is that it can be self-taught with future added data.13 We believe that as the patient number increases in the future, more information on VLBW infants can be provided to the computer, so the prediction model will become more accurate.

The median LOS of all VLBW infants was 61 days, and there was significant variability in the LOS among the different gestational ages, indicating that the current definition of “overstay in hospital” by the Taiwan National Health Insurance, which is defined as a LOS longer than 30 days, is not appropriate in this population. However, “overstay in hospital” is still an important quality index of hospital performance. Meanwhile, although these infants are high-risk for early readmission,24 previous studies have shown early discharge is not associated with higher
Table 2 The performance of each machine learning algorithm for LOS and PMA.

Target  Metric  Linear regression  Multilayer perceptron  LIBSVM  IBK    REPTree  Random Forest
LOS     R²      0.57               0.345                  0.572   0.309  0.56     0.573
LOS     MAE     15.40              20.64                  14.50   21.82  15.46    15.35
LOS     RMSE    25.73              38.07                  26.40   37.48  26.03    25.65
PMA     R²      0.243              0.062                  0.241   0.063  0.244    0.243
PMA     MAE     15.40              20.94                  14.50   21.35  15.34    15.38
PMA     RMSE    25.73              38.04                  26.40   36.77  25.71    25.82

LOS, length of stay; PMA, postmenstrual age; LIBSVM, support vector machine; IBK, k-nearest neighbors; REPTree, decision tree; MAE, mean absolute error; R², coefficient of determination; RMSE, root mean square error.
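For readers less familiar with the three metrics in Table 2, the following minimal sketch shows how they can be computed from observed and predicted values. It assumes the common definition R² = 1 − SS_res/SS_tot; the paper does not state the exact formula its software used, and the function names and example numbers here are hypothetical, not study data.

```python
import math

# Illustrative sketch only: names are hypothetical and the example
# values below are made up, not taken from the study.


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    observed_los = [60, 80, 100, 40]   # hypothetical LOS in days
    predicted_los = [65, 75, 90, 50]   # hypothetical model output
    print(mae(observed_los, predicted_los))        # 7.5
    print(r_squared(observed_los, predicted_los))  # 0.875
```

MAE and RMSE are in the same unit as the target (days for LOS), while R² is unitless, which is why the paper uses R² to compare models across LOS and PMA.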
readmission frequency.25 Our results may provide a reference for determining “overstay in hospital” in Taiwan among this population.

An accurate prolonged LOS prediction model can help improve quality of care in clinical practice. For example, if a patient was predicted to be “non-late discharge” after birth but stayed in the hospital longer than expected, we should analyze the reasons why the prolonged LOS occurred. On the other hand, physicians need to distinguish infants with higher risk for prolonged LOS for general resource planning. Physicians can plan in advance to have adequate beds, equipment, and medical staff in the neonatal intensive care units as soon as a VLBW infant is born.

Furthermore, a previous study showed that later morbidities, such as bronchopulmonary dysplasia or surgery for necrotizing enterocolitis, are factors contributing to the predictive models for LOS.8 However, later morbidities
Table 4 The performance of each machine learning algorithm for the target categorical variables.

Metric     Logistic regression  Multilayer perceptron  LIBSVM  IBK     REPTree  Random Forest
AUC        0.724                0.647a                 0.591a  0.574a  0.686a   0.712
Precision  0.761                0.713a                 0.794   0.681a  0.745    0.744
Recall     0.783                0.736a                 0.791   0.687a  0.774    0.773
F-measure  0.744                0.721a                 0.739   0.684a  0.733    0.739

AUC, area under the curve; LIBSVM, support vector machine; IBK, k-nearest neighbors; REPTree, decision tree.
a Statistical significance compared with the logistic regression (p-value<0.05).
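The classification metrics in Table 4 can be illustrated with a short sketch. It computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count as one half), and computes precision, recall, and F-measure for a single positive class. Treating late discharge as the positive class (label 1) is an assumption; the paper does not state whether its precision, recall, and F-measure are per-class or averaged over classes. All names and example values are hypothetical.

```python
# Illustrative sketch only: names are hypothetical and the example
# labels and scores are made up, not taken from the study.


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def precision_recall_f(labels, predictions):
    """Precision, recall and F-measure for the positive class (label 1)."""
    tp = sum(1 for l, p in zip(labels, predictions) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, predictions) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, predictions) if l == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0, 0]               # 1 = late discharge (assumed)
    scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]   # hypothetical predicted probabilities
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print(auc(labels, scores))
    print(precision_recall_f(labels, predictions))
```

Unlike precision and recall, AUC does not depend on a classification threshold, which is one reason it is used as the main comparison metric for the categorical target.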
were mostly observed later in this study population and after the great majority of the deaths had already occurred. In practice, it is appropriate to include these later morbidities to build an explanatory model, but it is inappropriate to build a prediction model for LOS.26

There are several limitations in this study. This study included 17 medical centers and 11 community hospitals in Taiwan. The individual discharge criteria for each hospital may vary and is a confounding factor in this study. Ideally, each hospital should have a predictive LOS model for its patient population and distinctive discharge criteria, but the low patient number in each hospital made it difficult to develop a robust model. We attempted to illustrate the LOS among VLBW infants in Taiwan at a national level to serve as a reference for acceptable LOS. To build prediction models, we should provide all collectible variables for ML to avoid missing important predictors. There were, however, some variables unavailable in this study. In the TNN database, some of the important variables were recorded without the actual time when the event occurred such as intraventricular hemorrhage, surgical necrotizing enterocolitis and treatment for patent ductus arteriosus. Some variables were not recorded in the database such as maternal gestational diabetes mellitus, formula feeding or breast milk feeding, and use of probiotics. Finally, due to the limited patient number, we used all patient data to build the prediction model. Although a 10-fold cross-validation was used to avoid the problem of over-fitting in ML, we did not have an external validation for the model evaluation. In a future study, we hope to have more patients with more variables to improve the accuracy of the proposed prediction model.

ML is a novel technology in many fields, including building prediction models in the field of neonatology. Further study is needed to validate the use of ML in prediction models in this special population.

Conclusion

An accurate LOS prediction model in a VLBW baby’s early life can be used in parental counseling and in general resource planning for physicians. Our study provides a benchmark of LOS among VLBW infants in each gestational age group in Taiwan. We also demonstrated that the ML technique can improve the accuracy of the prediction model of prolonged LOS in their early life. Further study is needed to externally validate these models in terms of predicting LOS.

Declaration of any potential financial and nonfinancial conflicts of interest

1. This work has not been supported by any funding.
2. The authors have no conflicts of interest relevant to this article.

Acknowledgments

We thank the members and administrator of Taiwan Neonatal Network and the research nurses and residents of the 28 participating hospitals for their help in the registration and data collection.

References

1. Chen YJ, Yu WH, Chen LW, Huang CC, Kang L, Lin HS, et al. Improved survival of periviable infants after alteration of the threshold of viability by the neonatal resuscitation program 2015. Children 2021;8.
2. Chang JH, Hsu CH, Tsou KI, Jim WT. Outcomes and related factors in a cohort of infants born in Taiwan over a period of five years (2007-2011) with borderline viability. J Formos Med Assoc 2018;117:365–73.
3. Su YY, Wang SH, Chou HC, Chen CY, Hsieh WS, Tsao PN, et al. Morbidity and mortality of very low birth weight infants in Taiwan-Changes in 15 years: a population based study. J Formos Med Assoc 2016;115:1039–45.
4. Schmitt SK, Sneed L, Phibbs CS. Costs of newborn care in California: a population-based study. Pediatrics 2006;117:154–60.
5. DeRienzo C, Kohler JA, Lada E, Meanor P, Tanaka D. Demonstrating the relationships of length of stay, cost and clinical outcomes in a simulated NICU. J Perinatol 2016;36:1128–31.
6. Ingram JC, Powell JE, Blair PS, Pontin D, Redshaw M, Manns S, et al. Does family-centred neonatal discharge planning reduce healthcare usage? A before and after study in South West England. BMJ Open 2016;6:e010752.
7. Seaton SE, Barker L, Draper ES, Abrams KR, Modi N, Manktelow BN, et al. Estimating neonatal length of stay for babies born very preterm. Arch Dis Child Fetal Neonatal Ed 2019;104:F182–6.
8. Hintz SR, Bann CM, Ambalavanan N, Cotten CM, Das A, Higgins RD, et al. Predicting time to hospital discharge for extremely preterm infants. Pediatrics 2010;125:e146–54.
9. Manktelow B, Draper ES, Field C, Field D. Estimates of length of neonatal stay for very premature babies in the UK. Arch Dis Child Fetal Neonatal Ed 2010;95:F288–92.
10. Lee HC, Bennett MV, Schulman J, Gould JB, Profit J. Estimating length of stay by patient type in the neonatal intensive care unit. Am J Perinatol 2016;33:751–7.
11. Ma H, Xu CF, Shen Z, Yu CH, Li YM. Application of machine learning techniques for clinical predictive modeling: a cross-sectional study on nonalcoholic fatty liver disease in China. BioMed Res Int 2018;2018:4304376.
12. Rajkomar A, Dean J, Kohane I. Machine learning in medicine. N Engl J Med 2019;380:1347–58.
13. Heo J, Yoon JG, Park H, Kim YD, Nam HS, Heo JH. Machine learning-based model for prediction of outcomes in acute stroke. Stroke 2019;50:1263–5.
14. Daghistani TA, Elshawi R, Sakr S, Ahmed AM, Al-Thwayee A, Al-Mallah MH. Predictors of in-hospital length of stay among cardiac patients: a machine learning approach. Int J Cardiol 2019;288:140–7.
15. Muhlestein WE, Akagi DS, Davies JM, Chambless LB. Predicting inpatient length of stay after brain tumor surgery: developing machine learning ensembles to improve predictive performance. Neurosurgery 2019;85:384–93.
16. Heinemann J. Machine learning in untargeted metabolomics experiments. Methods Mol Biol 2019;1859:287–99.
17. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: an update. SIGKDD Explor Newsl 2009;11:10–8.
18. Witten IH, Frank E, Hall MA, Pal CJ. The WEKA workbench. Online appendix for "data mining: practical machine learning tools and techniques". ed. Morgan Kaufmann; 2016.
19. Deo RC. Machine learning in medicine. Circulation 2015;132:1920–30.
20. Senders JT, Staples PC, Karhade AV, Zaki MM, Gormley WB, Broekman MLD, et al. Machine learning and neurosurgical outcome prediction: a systematic Review. World Neurosurg 2018;109:476–86. e1.
21. Bouckaert RR, Frank E, Hall M, Kirkby R, Reutemann P, Seewald A, et al. WEKA manual for version 3-8-0. 2016. p. 55–96.
22. Flach P, Hernández-Orallo J, Ferri C. A coherent interpretation of AUC as a measure of aggregated classification performance. In: Proceedings of the 28th international conference on international conference on machine learning. Bellevue, Washington, USA: Omnipress; 2011. p. 657–64.
23. Lambert PC, Wilkes SR, Crowther MJ. Flexible parametric modelling of the cause-specific cumulative incidence function. Stat Med 2017;36:1429–46.
24. Tseng YH, Chen CW, Huang HL, Chen CC, Lee MD, Ko MC, et al. Incidence of and predictors for short-term readmission among preterm low-birthweight infants. Pediatr Int 2010;52:711–7.
25. Lamarche-Vadel A, Blondel B, Truffer P, Burguet A, Cambonie G, Selton D, et al. Re-hospitalization in infants younger than 29 weeks’ gestation in the EPIPAGE cohort. Acta Paediatr 2004;93:1340–5.
26. Sainani KL. Explanatory versus predictive modeling. PM & R : the journal of injury, function, and rehabilitation 2014;6:841–4.