
Received: 12 May 2022    Revised: 31 August 2022    Accepted: 14 October 2022

DOI: 10.1111/cas.15624

ORIGINAL ARTICLE

A Bayesian network predicting survival of cervical cancer patients: Based on Surveillance, Epidemiology, and End Results

Guangcong Liu | Zhuo Yang | Danbo Wang

Cancer Hospital of China Medical University, Liaoning Cancer Hospital and Institute Shenyang, Shenyang, People's Republic of China

Correspondence
Danbo Wang, Cancer Hospital of China Medical University, Liaoning Cancer Hospital and Institute Shenyang, Shenyang, People's Republic of China.
Email: wangdanbo@cancerhosp-ln-cmu.com

Funding information
Doctoral Start-up Foundation of Liaoning Province, Grant/Award Number: 2021-BS-041; Natural Science Foundation of Liaoning Province, Grant/Award Number: 2021-YGJC-15

Abstract
This study aimed to build a comprehensive model for predicting the overall survival (OS) of cervical cancer patients who received standard treatments and to build a series of new stages based on the International Federation of Gynecologists and Obstetricians (FIGO) stages to improve such predictions. We collected data on cervical cancer patients diagnosed since the year 2000 from the Surveillance, Epidemiology, and End Results (SEER) database. Cervical cancer patients who received radiotherapy or surgery were included. Log-rank tests and Cox regression were used to identify potential factors of OS. Bayesian networks (BNs) were built to predict 3- and 5-year survival. We also grouped the patients into new stages by clustering their 5-year survival probabilities based on FIGO stage, age, and tumor differentiation. Cox regression suggested black ethnicity, adenocarcinoma, and single status as risks for poorer prognosis, in addition to age and stage. A total of 43,749 and 39,333 cases were finally eligible for the 3- and 5-year BNs, respectively, with 11 variables included. Cluster analysis and Kaplan-Meier curves indicated that it was best to divide the patients into nine modified stages. The BNs had excellent performance, with area under the curve and maximum accuracy of 0.855 and 0.804 for 3-year survival, and 0.851 and 0.787 for 5-year survival, respectively. Thus, BNs are excellent candidates for predicting cervical cancer survival. It is necessary to consider age and tumor differentiation when estimating the prognosis of cervical cancer using FIGO stages.

KEYWORDS
5-year survival, Bayesian network, cervical cancer, prediction, SEER

1 | INTRODUCTION

Cervical cancer is one of the most common malignant tumors in women worldwide and accounted for more than 310,000 deaths in 2018.1 Although much work has been done, its incidence and mortality have shown an overall upward trend in the past few years.2 Although the “Global strategy to accelerate the elimination of cervical cancer” was launched by WHO in 2018, less than 30% of

Abbreviations: AUC, area under the curve; BN, Bayesian network; DAGs, directed acyclic graphs; FIGO, International Federation of Gynecologists and Obstetricians; HPV, human papillomavirus; K-M, Kaplan-Meier; ROC, receiver-operating characteristic curve; SEER, Surveillance, Epidemiology, and End Results

Guangcong Liu and Zhuo Yang contributed equally to this paper.

This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in
any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
© 2022 The Authors. Cancer Science published by John Wiley & Sons Australia, Ltd on behalf of Japanese Cancer Association.

Cancer Science. 2023;114:1131–1141. wileyonlinelibrary.com/journal/cas

13497006, 2023, 3, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/cas.15624 by Nat Prov Indonesia, Wiley Online Library on [28/06/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
1132      LIU et al.

lower- or middle-income countries had introduced the human papillomavirus (HPV) vaccine into their national immunization schedules up to 2020.3 Therefore, the battle with cervical cancer will face difficulties in the next few decades.

The prognosis of cervical cancer is generally favorable compared with other types of malignant tumors. According to the statistics of the Surveillance, Epidemiology, and End Results (SEER) for 2011-2017, the overall 5-year survival rate of cervical cancer was 66.3%. Although the 5-year survival rate of localized cervical cancer was 91.9%, that of regional and distant cervical cancer declined to 58.2% and 17.6%, respectively.4 Despite clinical stages providing excellent predictions for the prognosis of cervical cancer patients, the prognosis is also significantly affected by age, histological type, and tumor differentiation.5–9 Therefore, an additional clinical stage system that takes into account the factors above is necessary.

Using the SEER cohort, several survival prediction models for cervical cancer patients have been constructed. Li et al and Yu et al constructed two nomograms for predicting the prognosis of cervical squamous cell carcinoma and newly diagnosed stage IVB patients, respectively.10,11 Other recent relevant studies focused on patients of specific stages or who received specific treatments.9,12–15 However, current models mainly focus on subpopulations, and it may be inconvenient for physicians to select the most suitable one from a variety of models for a patient with some specific characteristics. Moreover, not all patients in the SEER cohort received regular treatments, which might decrease the performance of models based on SEER. Therefore, models designed for the total population with regular treatments are usually more practical. Recently, machine learning approaches have drawn much research attention. Several studies have constructed machine learning–based models to predict survival of cancer patients, and the results suggested better or at least equal performance compared with nomograms.16–19 As one of the most popular machine-learning algorithms, Bayesian networks (BNs) are directed acyclic graphs (DAGs) that are widely used in the healthcare domain. Use of BN enables causal inference among the included features,20 and this makes it more attractive and suitable for survival prediction. Unlike parametric statistical models, BNs do not require the data to follow a specific type of parametric distribution, which is more useful in dealing with real-world data. The points above indicate the desirability of constructing a BN prediction model for cancer patient survival. In addition, the International Federation of Gynecologists and Obstetricians (FIGO) staging is the most commonly used clinical staging for cervical cancer, which performs excellently in the prognosis of patients. However, its performance is not perfect because the prognosis can also be affected by other factors. For example, older stage I patients with poorly differentiated tumors may have shorter survival than younger stage II patients. Therefore, the performance of FIGO staging may improve if the above factors were also taken into account.

This study aimed to construct 3- and 5-year BN-based survival prediction models for cervical cancer patients. We also explored modified clinical stages based on FIGO staging using the BN predictions.

2 | METHODS

2.1 | Data sources and inclusion criteria

All data were acquired from the SEER database (https://seer.cancer.gov/). The data of “Incidence-SEER Research Plus Data, 18 regions, Nov 2020 Sub (2000–2018)” were obtained. The cases whose “Site recode ICD O3.WHO 2008” was filled with “Cervix Uteri” were identified as cervical cancer patients. However, some patients enrolled in SEER did not receive the recommended therapies due to patients' preferences or comorbidities, which might have shortened their survival, attenuating the performance of the model in which they were included. According to National Comprehensive Cancer Network guidelines, cervical cancer patients should undergo surgery (local tumor excision such as cone biopsy, hysterectomy, modified hysterectomy, and radical hysterectomy) or radiotherapy, which mainly depends on their clinical stage. Taking the treatment-related variables in SEER into account, patients recorded in SEER as having received surgery or radiotherapy were defined as patients with standard treatment. Therefore, the cases that fulfilled the following criteria were eligible for this study: (a) cervical cancer patients with clear survival months and vital status; (b) patients who received surgery or radiotherapy. The cases that matched one of the following situations were excluded: patients with unclear American Joint Committee on Cancer/SEER stages and patients whose primary surgical sites or radiotherapy status were unclear. Additionally, our models aimed to predict the survival status after 3 and 5 years, so cases lost to follow-up were excluded, as well as survivors whose follow-up time was shorter than 36 and 60 months for the 3- and 5-year models, respectively.

2.2 | Feature selection and data transformation

Features in the BN were initially selected with a combination of a priori knowledge and survival analysis of the cases in this study. First, we included the well-proven prognosis-related factors. However, no null values were allowed in the construction of BN, so any variables with more than 50% null values were excluded. To improve performance, data transformations were completed before constructing the BNs. Both discrete and continuous numeric variables were classified into three categories (eg, age at diagnosis). We combined categorical variables into minimal numbers of categories. Then log-rank tests and Cox regression were performed to explore the associations between the variables and overall survival of cervical cancer patients.

2.3 | BN construction

A BN is a DAG that represents nodes (variables) and their conditional probabilities. The process of BN construction is also called BN learning, which includes structure learning (constructing the relationships between the nodes in the network) and parameter
learning (calculating conditional probabilities between the nodes). A hill-climbing (HC) algorithm was adopted in structure learning. One of BNs' main merits is that they allow the use of a priori knowledge, namely setting the acknowledged relationships between the nodes in structure learning. For example, the stage significantly affected overall survival, so we set this relationship before structure learning as a “whitelist.” Some impossible relationships were also known; for example, race could not be affected by stage or grade. These relationships were defined as the “blacklist” and set to prevent such associations in the model. In this study, we used the “bnlearn” package in the R statistical program to achieve HC learning based on a priori knowledge.21 For parameter learning, we preferred the Bayes method to the maximum likelihood estimation method because its estimated parameters are smoother, making inference both easier and more robust.22 Two BN models were constructed to predict the 3- and 5-year survival, respectively.

To assess the survival probability of cervical cancer patients with specific characteristics, we calculated the survival probabilities under the combination of each category of FIGO stage, histological type, age at diagnosis, and tumor differentiation. These conditions were grouped using k-means cluster analysis via the survival probabilities. We tried to group the patients into the largest number of stages to better predict their prognosis. The largest number of stages whose survival curves remained separate from each other was deemed the optimal number of the modified stages. The candidate numbers of modified stages tested ranged from 4 to 12.

To validate the performance of our BN models, the dataset was divided using random sampling into training (containing 90% of observations) and test (10% of observations) sets. Both internal and external validations were adopted to validate the prediction models. Tenfold cross-validations were used to perform internal validation. For the external validation, the models were validated using the test set. We depicted the receiver-operating characteristic (ROC) curves and calculated the areas under the curves (AUCs) as well as the maximum accuracies of each model. The prediction models were constructed based on the training set.

2.4 | Cluster analysis for modified stages

To make better use of the BN models, we clustered the patients into 4-12 modified stages via k-means cluster analysis based on the 5-year survival probabilities with different FIGO stages, ages, and tumor differentiations. Considering that survival probability derived from the BN might be unstable if too many conditions were specified, the histological types were excluded. The most suitable number of modified stages was determined by checking the Kaplan-Meier (K-M) curves. The more the curves were distinguished from each other, the more suitable was the number of stages considered.

All statistical analyses and BN-related procedures were completed with the R statistical program version 4.0.3 (http://cran.r-project.org/).

3 | RESULTS

3.1 | Cases and variables

A total of 66,872 cases were identified with 52 variables and were extracted. First, the TNM stage was summarized according to the stage-related variables, and then they were replaced with four variables: T, N, M, and TNM stages. Duplicate variables as well as cases with unknown stages or survival months were deleted. Nine variables representing the metastasis of tumors were excluded because they had too many null values, as they were not recorded until 2010. The T, N, M, and TNM stages were then excluded because we transferred them into a new variable “FIGO stage.” Vital status and survival months were deleted because we transferred them into a binary variable representing survival status at 36 and 60 months for the 3- and 5-year models, respectively. The variable “diagnosed year” was excluded because it would not help predict the survival of patients diagnosed in the future. Of the patients, 5652 cases were excluded due to unclear clinical stages, 2872 were excluded due to unclear survival months, and 5022 were excluded due to not receiving standard treatments. In addition, 9180 (for the 3-year model) and 15,396 (for the 5-year model) patients were excluded due to an insufficiently long follow-up time. Finally, two BNs with 11 variables and 43,749 (for the 3-year model) and 39,333 cases (5-year model) were identified. The flowchart of this section is shown in Figure 1.

3.2 | Demographic characteristics

Age at diagnosis was divided into three groups: young (under 40 years old), middle (40-65 years), and old (above 65 years). The races (from the variable “Race recode W B AI API”) were combined into “Asians” (“Asian or Pacific Islander”), “Blacks,” “Whites,” and “Others” (“American Indian/Alaska Native” or “unknown”). Histological types were grouped into “adenomas or adenocarcinomas,” “squamous cell neoplasms,” and “other” types. Marital status was grouped as “Single,” “Married,” “Separated, Divorced, or Widowed (SDW).” The details of the classifications are listed in Table S1 in supplementary files.

The proportions of each subgroup's cases were similar between the 3- and 5-year models. The young, middle, and old age-groups represented about 27%, 48%, and 25% for both models, respectively. About 36% of tumors were localized (IA1-IIA2), about 54% were regional (IIB1-IVA), and the remaining 10% were at distant stage (IVB). White people accounted for most of the cervical cancer patients (about 75%), followed by black people (about 14%) and Asians (about 9%). The overall 3- and 5-year survival rates were 73.2% and 64.4%, respectively. Chi-square tests were performed to examine differences in survival rates among the subpopulations. Apart from widely accepted factors like stage, tumor differentiation, and age at diagnosis, we also found significant differences in 3- and 5-year survival rates for race, marital status, and histological type. Black people had lower survival rates than
FIGURE 1 Flowchart of patient and variable selection for building Bayesian network (BN)-based survival prediction models

other races (3-year survival: 63.4% vs. 74.8%, and 5-year survival: 53.3% vs. 66.2%). Patients with adenomas or adenocarcinomas also had higher survival rates than patients with squamous cell neoplasms or other types of tumors. Details of the descriptive statistics are shown in Table 1.

3.3 | Survival analysis

Because survival time of patients diagnosed during different periods might differ, we added the diagnosed year into log-rank tests to avoid selection bias when constructing the BNs. No significant differences were found among the overall survival of patients diagnosed during different periods. The log-rank tests indicated that all the included variables were associated with overall survival (Figure 2). Cox regression confirmed that higher age, higher stage, lower tumor differentiation, and single status were associated with shorter overall survival. Concerning the impacts of race, being Asian was a significant protective factor (hazard ratio [HR] = 0.85; 95% confidence interval [CI], 0.80-0.90), whereas being black was identified as a risk (HR = 1.19; 95% CI, 1.14-1.24). Interestingly, squamous cell carcinoma was also identified as a protective factor relative to adenocarcinoma. The Cox regression results are shown in Table 2.

3.4 | Bayesian network

The 3- and 5-year models shared the same structure (Figure 3). Apart from the whitelist that was set, some plausible associations were also obtained by the structure learning. For example, race affected the age at diagnosis (node “Age”), as well as marital status (node “ms”). The age at diagnosis also influenced the clinical stage (node “FIGO”). These associations accorded with current evidence, supporting the validity of the causal inference of the BN.

3.5 | Usage of the BNs

The detailed usage of our BNs is shown in supplementary files.
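As a rough illustration of what using such a network involves, the Python sketch below estimates a single conditional probability table, P(survival status | FIGO stage group), and reads a survival probability from it. It is not the authors' bnlearn workflow: the six records, the two stage groups, and the imaginary sample size `iss` are invented for illustration. Setting `iss=0` gives the maximum likelihood estimate, while `iss>0` applies the kind of Bayesian smoothing that the Methods prefer.

```python
from collections import Counter

# Hypothetical toy records: (FIGO stage group, survival status). Not SEER data.
records = [
    ("I", "alive"), ("I", "alive"), ("I", "alive"), ("I", "dead"),
    ("IV", "dead"), ("IV", "dead"),
]
STATES = ("alive", "dead")
PARENTS = ("I", "IV")


def fit_cpt(rows, iss=0.0):
    """Estimate P(status | stage).

    iss=0 gives the maximum likelihood estimate; iss>0 spreads an
    imaginary sample of that size uniformly over all table cells
    (a Dirichlet prior), which pulls estimates away from 0 and 1.
    """
    counts = Counter(rows)
    prior = iss / (len(PARENTS) * len(STATES))  # pseudo-count per cell
    cpt = {}
    for stage in PARENTS:
        total = sum(counts[(stage, s)] for s in STATES) + prior * len(STATES)
        cpt[stage] = {s: (counts[(stage, s)] + prior) / total for s in STATES}
    return cpt


mle = fit_cpt(records)            # P(alive | IV) is exactly 0
bayes = fit_cpt(records, iss=4)   # P(alive | IV) becomes nonzero
print(mle["IV"]["alive"], bayes["IV"]["alive"])
```

With only two stage IV records, both deaths, the maximum likelihood table declares survival impossible for that group, whereas the smoothed table does not. A full network repeats this estimation for every node given its parents and combines the tables at query time.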
TABLE 1 Descriptive statistics of the variables that were finally included in the Bayesian network (BN)

3-year model 5-year model

Variables N Percentage % Survival rate % p value N Percentage % Survival rate % p value

ALL 43,749 – 73.20 – 39,333 – 64.40 –


Age
Young 12,181 27.80 83.90 <0.001 10,790 27.40 78.67 <0.001
Middle 20,898 47.80 75.09 18,709 47.60 66.34
Old 10,670 24.40 57.40 9834 25 44.92
Race
White 33,111 75.70 74.61 <0.001 29,734 75.60 66.09 <0.001
Asian 3879 8.90 75.69 3399 8.60 66.43
Black 6125 14 63.44 5660 14.40 53.30
Other/unknown 634 1.40 80.76 540 1.40 72.59
Marital status
Married 19,330 44.20 79.24 <0.001 17,367 44.20 71.60 <0.001
Separated/Widowed/Divorced 10,560 24.10 63.74 9727 24.70 53.38
Single 11,767 26.90 71.16 10,412 26.50 61.90
Unknown 2092 4.80 77.25 1827 4.60 68.20
Tumor differentiation
Grade I 4348 9.90 89.26 <0.001 3713 9.40 83.68 <0.001
Grade II 13,385 30.60 76.14 11,852 30.10 67.06
Grade III 12,995 29.70 63.56 11,893 30.20 53.31
Grade IV 1015 2.30 57.54 937 2.40 45.68
Unknown grade 12,006 27.40 75.98 10,938 27.80 68.52
Histological types
Adenomas or adenocarcinomas 9114 20.80 80.67 <0.001 8010 20.40 72.61 <0.001
Squamous cell carcinoma 30,130 68.90 72.56 27,200 69.20 63.62
Others 4505 10.30 62.71 4123 10.50 53.29
FIGO stage
IA 9603 22 97.22 <0.001 8389 21.30 94.70 <0.001
IB 12,401 28.30 89.61 10,914 27.70 83.32
IIA 1861 4.30 72.33 1660 4.20 59.28
IIB 4890 11.20 70.45 4471 11.40 58.56
IIIA 742 1.70 45.96 696 1.80 33.33
IIIB 2908 6.60 43.23 2756 7 32.44
IIIC 6206 14.20 62.99 5518 14 49.82
IVA 918 2.10 27.89 885 2.30 19.44
IVB 4220 9.60 24.55 4044 10.30 15.63
Surgeries
No 16,434 37.60 49.56 <0.001 15,197 38.60 37.85 <0.001
Local 5513 12.60 82.73 4832 12.30 74.69
Total 9927 22.70 89.40 8736 22.20 83.76
Radical 11,780 26.90 88.36 10,477 26.60 82.11
Unknown 95 0.20 50.53 91 0.20 40.66


Radiation therapy
Yes 26,102 59.70 60.06 <0.001 23,822 60.60 48.53 <0.001
No/unknown 17,647 40.30 92.72 15,511 39.40 88.69
Chemical therapy
Yes 20,987 48 60.90 <0.001 18,942 48.20 48.73 <0.001
No/unknown 22,762 52 84.60 20,391 51.80 78.90
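
The survival rates in Table 1 rest on a binary survival status at 36 or 60 months. The sketch below shows one way such a label could be derived from survival months and vital status, following the exclusion rule stated in the Methods (survivors with a shorter follow-up are dropped). The function name, the record layout, and the treatment of patients followed for exactly the horizon length are assumptions, not details given in the paper.

```python
def survival_label(months, vital_status, horizon):
    """Return 'alive', 'dead', or None (excluded) at `horizon` months.

    Assumptions (not spelled out in the paper): a patient followed for at
    least `horizon` months counts as alive at the horizon regardless of
    later status; a death before the horizon counts as dead; a patient
    last seen alive before the horizon has unknown status and is excluded.
    """
    if months >= horizon:
        return "alive"
    if vital_status == "dead":
        return "dead"
    return None  # censored before the horizon


# Hypothetical records: (survival months, vital status at last contact)
patients = [(70, "alive"), (48, "dead"), (20, "dead"), (40, "alive")]
labels_3y = [survival_label(m, v, 36) for m, v in patients]
labels_5y = [survival_label(m, v, 60) for m, v in patients]
print(labels_3y)  # ['alive', 'alive', 'dead', 'alive']
print(labels_5y)  # ['alive', 'dead', 'dead', None]
```

The last patient is usable for the 3-year label but excluded from the 5-year label, which is why the 5-year cohort in Table 1 is smaller than the 3-year cohort.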

FIGURE 2 Kaplan-Meier (K-M) curves of the variables included in the Bayesian network (BN)

3.6 | Modified stages

After taking into account FIGO stage, age, and tumor differentiation, we clustered survival probabilities of patients into 4-12 modified stages via k-means cluster analysis. The K-M curves were most distinguished from each other when the number of groups was set to nine (Table S2 and Figure 4), which we consider the most suitable number of our modified stages. Compared with those grouped by FIGO stage only, the K-M curves of nine modified stages were also distinguished from each other more obviously (Figure 4). In addition, the modified stages suggested the importance of more attention to tumor differentiation and age when assessing overall survival of cervical cancer. For example, most stage IA patients were grouped into the modified stage I, but the older stage IA patients with grade III tumors were grouped into the modified stage III; older stage IA patients with grade IV tumors were grouped into the modified stage VI (Table S2). Other K-M curves of the modified stages are shown in Figure S1 in supplementary files.
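
The clustering step can be illustrated with a one-dimensional k-means over predicted survival probabilities. The sketch below is a plain Lloyd's-algorithm implementation with a deterministic initialization. The six probabilities are invented, and the authors' actual R routine and its settings are not given in the text, so this shows the technique rather than their procedure.

```python
def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; returns (centers, labels)."""
    pts = sorted(values)
    # Deterministic start: spread initial centers across the sorted values.
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center for each value.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical 5-year survival probabilities for six patient profiles.
probs = [0.95, 0.93, 0.60, 0.58, 0.20, 0.16]
centers, labels = kmeans_1d(probs, k=3)
# Rank clusters from best to worst survival to obtain ordered "stages".
order = sorted(range(3), key=lambda j: -centers[j])
stages = [order.index(lab) + 1 for lab in labels]
print(stages)
```

Each profile (a combination of FIGO stage, age group, and grade) is assigned the stage of its cluster. Repeating the procedure for k from 4 to 12 and comparing the resulting K-M curves corresponds to the selection process described above.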
TABLE 2 Results of Cox regression

Groups | Variables | Reference | HR | LL | UL | Sig
Age | Middle | Young | 1.311 | 1.253 | 1.372 | *
 | Old | | 2.082 | 1.982 | 2.186 | *
FIGO stage | Stage IB | Stage IA | 2.326 | 2.144 | 2.523 | *
 | Stage IIA | | 3.558 | 3.204 | 3.951 | *
 | Stage IIB | | 3.69 | 3.364 | 4.048 | *
 | Stage IIIA | | 6.286 | 5.579 | 7.082 | *
 | Stage IIIB | | 6.938 | 6.31 | 7.628 | *
 | Stage IIIC | | 6.041 | 5.497 | 6.639 | *
 | Stage IVA | | 10.199 | 9.135 | 11.387 | *
 | Stage IVB | | 14.136 | 12.909 | 15.479 | *
Marital status | Separated/Divorced/Widowed | Married | 1.266 | 1.219 | 1.314 | *
 | Single | | 1.172 | 1.127 | 1.218 | *
 | Unclear | | 1.093 | 1.013 | 1.18 | *
Race | RaceAsian | Whites | 0.849 | 0.803 | 0.897 | *
 | RaceBlack | | 1.187 | 1.141 | 1.236 | *
 | RaceO/UC | | 0.836 | 0.719 | 0.972 | *
Tumor differentiation | Grade II | Grade I | 1.245 | 1.158 | 1.339 | *
 | Grade III | | 1.528 | 1.421 | 1.642 | *
 | Grade IV | | 1.739 | 1.562 | 1.935 | *
 | Unknown | | 1.212 | 1.126 | 1.304 | *
Histological types | Other types | AC | 1.248 | 1.178 | 1.321 | *
 | SCC | | 0.887 | 0.849 | 0.926 | *
Primary surgical site | SurSLocal | No surgeries | 0.786 | 0.721 | 0.856 | *
 | SurSRadical | | 0.597 | 0.551 | 0.647 | *
 | SurSTotal | | 0.599 | 0.551 | 0.65 | *
 | SurSUC | | 1.559 | 1.227 | 1.981 | *
Radiation | Implants & Beam | Beam | 0.689 | 0.665 | 0.714 | *
 | Implants | | 0.729 | 0.681 | 0.781 | *
 | No/Unknown | | 0.647 | 0.587 | 0.713 | *
Radiation sequence | Radiation before surgery | Radiation after surgery | 1.339 | 1.224 | 1.464 | *
 | Other/Unknown | | 1.01 | 0.935 | 1.09 |
Chemical therapy | Yes | No | 0.658 | 0.633 | 0.684 | *
Lymph nodes examination | Unclear | No | 0.975 | 0.825 | 1.152 |
 | Yes | | 0.669 | 0.251 | 1.785 |
Lymph nodes positive | Not examined/unclear | Negative | 1.123 | 0.421 | 2.997 |
 | Positive | | 1.05 | 0.975 | 1.132 |

Abbreviations: AC, adenomas or adenocarcinomas; HR, hazard ratio; LL, lower limit; SCC, squamous cell carcinoma; UL, upper limit.
*p < 0.05.

3.7 | BN model validation

Both 3- and 5-year survival prediction models were validated. There were only slight differences in performance between the two BNs (Table 3). The AUCs were 0.855 (maximum accuracy = 0.804) for the 3-year and 0.851 (maximum accuracy = 0.787) for the 5-year model. Cross-validation indicated that the mean AUC was about 0.845 for both models, with a maximum accuracy of about 0.81 for the 3-year and 0.78 for the 5-year model. These results indicated the robust and excellent performance of the BN prediction models. The ROCs including both external validation and cross-validation are shown in Figure 5.
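
The two reported metrics can be computed from predicted probabilities and observed outcomes as sketched below. AUC is calculated here through its rank (Mann-Whitney) formulation. "Maximum accuracy" is interpreted as the best accuracy over all probability thresholds, which is an assumption about the authors' definition. The scores and outcomes are invented.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def max_accuracy(scores, labels):
    """Best accuracy over thresholds t, predicting 1 when score >= t."""
    thresholds = sorted(set(scores)) + [float("inf")]
    best = 0.0
    for t in thresholds:
        correct = sum((s >= t) == (y == 1) for s, y in zip(scores, labels))
        best = max(best, correct / len(labels))
    return best


# Hypothetical predicted survival probabilities and outcomes (1 = survived).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(auc(scores, labels), max_accuracy(scores, labels))
```

In this toy case eight of the nine positive-negative pairs are ranked correctly, and the best threshold classifies five of the six patients correctly.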
FIGURE 3 Structure of the Bayesian network (BN) prediction model. Nodes: Race, race; Age, age group; Figo, International Federation of Gynecologists and Obstetricians (FIGO) stages; Hist, histological type; Grade, grading and differentiation; ST, survival status after 3 or 5 years; ms, marital status; Chem, chemotherapy; Rad, radiotherapy; SurS, primary surgical site; Rsq, radiation sequence
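
The whitelist and blacklist described in the Methods constrain which arcs the hill-climbing search may consider. The sketch below shows only that constraint logic for the arc-addition move, not the scoring function or the authors' bnlearn call. The search starts from the whitelisted arcs and may propose any new arc that is not blacklisted and keeps the graph acyclic. Node names follow Figure 3, and the listed arcs come from the two examples in the text (stage affects survival; race cannot be affected by stage or grade). The function names are hypothetical.

```python
def creates_cycle(arcs, new_arc):
    """True if adding new_arc (parent, child) would close a directed cycle."""
    parent, child = new_arc
    # A cycle appears iff `parent` is already reachable from `child`.
    stack, seen = [child], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(c for p, c in arcs if p == node)
    return False


def legal_additions(nodes, arcs, blacklist):
    """Arcs a hill-climbing step may add: new, not blacklisted, acyclic."""
    return [
        (a, b)
        for a in nodes for b in nodes
        if a != b
        and (a, b) not in arcs
        and (a, b) not in blacklist
        and not creates_cycle(arcs, (a, b))
    ]


nodes = ["Race", "Figo", "Grade", "ST"]
whitelist = {("Figo", "ST")}                       # stage affects survival
blacklist = {("Figo", "Race"), ("Grade", "Race")}  # race has no such parents
arcs = set(whitelist)                              # search starts from the whitelist
candidates = legal_additions(nodes, arcs, blacklist)
print(("Figo", "Race") in candidates, ("ST", "Figo") in candidates)
```

Both checks print `False`: the first arc is blacklisted, and the second would reverse a whitelisted arc and create a cycle. A complete hill-climbing search would score every legal move, including arc deletions and reversals, and apply the best one until no move improves the score.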

FIGURE 4 Comparison of the Kaplan-Meier (K-M) curves between International Federation of Gynecologists and Obstetricians (FIGO) stages and the modified stages
4 | DISCUSSION

We constructed two BN models to predict the 3- and 5-year overall survival probabilities of cervical cancer patients from SEER. Unlike most similar papers, the BN models we constructed were suitable for all patients who received regular treatments and not just for some specific types of patients. However, we did not further compare the prognosis between patients with different combinations of treatments, because of the intrinsic limitation of the treatment-related variables in SEER.23 Additionally, we used cluster analysis to restage the patients based on FIGO stage, age, and tumor differentiation, and the modified stages performed better in predicting overall survival of cervical cancer patients.

So far, there has been no consensus on the role of black ethnicity in poorer survival of cervical cancer,24–29 even after controlling for treatment patterns.30,31 In this study, we selected patients who underwent at least one type of radiation therapy or surgery, but we still found that being black was associated with poorer overall survival, and being Asian was a protective factor. One possible explanation might be socio-economic status (SES), which we could not identify using the current database. A recent study found that Medicaid and uninsured HPV-related cervical cancer patients in the United States had a higher risk of death.32 Similar findings were also found for black people, suggesting that hospital quality might be an important contributor to disparities in mortality among black and white patients.33,34 Therefore, this finding might add new evidence to attract more attention to black people with cervical cancer. In addition, marital status has frequently been reported as a prognostic factor of survival of cervical cancer,9,35 and married patients were mostly reported to have better prognoses of diseases, not limited to cancer.36,37 Previous studies proposed that married patients have better compliance with medical treatments due to perceived spousal support, and they were less stressed, depressed, and anxious after being diagnosed with cancer compared with unmarried patients.38

This study used a machine learning–based approach to predict the 3- and 5-year overall survival probabilities of cervical cancer patients. There were some plausible associations in our BNs, which were generated by structure learning. The age at diagnosis (node “Age” in Figure 3) could affect the clinical stage (node “FIGO”). Race (node “Race”) was suggested to affect age at diagnosis and marital status (node “ms”). This might be due to the different compositions of age groups and marital status among the races. Another interesting finding was that age at diagnosis could be affected by marital status. We performed several chi-square tests to further examine the differences in proportions of older age–group patients, which showed that the proportion of this group was significantly lower in married than in SDW and single patients (both p < 0.001). As mentioned above, married patients tend to have better compliance,38 which might also encourage women to participate in screening for cervical cancer.

TABLE 3 Performances of the Bayesian network (BN)

Model | External validation: AUC | External validation: Accuracy | Internal cross-validation: Mean AUC | Internal cross-validation: Mean maximum accuracy
3-year | 0.855 | 0.804 | 0.845 | 0.805
5-year | 0.851 | 0.787 | 0.849 | 0.780

Abbreviation: AUC, area under the curve.

FIGURE 5 Receiver-operating characteristic curves (ROCs) of both 3- and 5-year prediction models, including external validation (thick lines) and 10-fold cross-validation (thin lines)
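
The validation design behind Figure 5 (a random 90%/10% split plus tenfold cross-validation) can be sketched as follows. The paper does not state whether cross-validation was run on the training portion or on the full dataset, so running it on the training indices is an assumption here, as are the function name, the seed, and the sample size.

```python
import random


def split_and_folds(n, test_frac=0.10, k=10, seed=0):
    """Return (test indices, list of k validation folds over the training indices)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # reproducible random sampling
    n_test = round(n * test_frac)
    test, train = idx[:n_test], idx[n_test:]
    folds = [train[i::k] for i in range(k)]   # k near-equal, disjoint folds
    return test, folds


test, folds = split_and_folds(1000)
for i, val in enumerate(folds):
    fit_on = [j for f in folds[:i] + folds[i + 1:] for j in f]
    # Fit the model on `fit_on`, then evaluate on `val` (model code omitted).
    assert not set(fit_on) & set(val)
print(len(test), [len(f) for f in folds])
```

Each of the ten folds serves once as the validation set, which yields the ten thin curves in a plot such as Figure 5. The held-out 10% is scored once by the model fitted on the training portion.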
The external validation and cross-validation indicated good performance of both models, with AUCs of about 0.85 and maximum accuracies of 0.78 or higher. The performance of our models again suggests a promising future for the application of machine learning in medicine. In addition, our BNs are suitable for all cervical cancer patients who receive standard treatments. The variables in our BNs are easy to acquire, which makes the models especially practical for middle- and low-income regions. Another contribution of this study is the set of modified stages obtained by clustering. Starting from the FIGO stages, we took age and tumor differentiation into account and derived nine new stages. These nine stages differed slightly from the FIGO stages (Table S2), and both age at diagnosis and tumor differentiation could considerably affect the prognosis, suggesting the necessity of considering these factors when predicting the survival of cancer patients from clinical stages. For physicians who are not familiar with the R statistical program, it is much easier to use these modified stages to estimate the survival of cervical cancer patients, which makes our findings more practical.

Nevertheless, the limitations should also be specified. First, we did not acquire individual SES-related variables from SEER, and so the predictions might be biased. Second, the treatment-related data in SEER might be biased because they do not capture factors such as patient preference, physician recommendation, comorbidity, and proximity to treatment providers.23 Third, our BNs can only predict probabilities for binary survival status, whereas quantitative survival periods are also preferred in practice. However, given the practicality and performance of our BNs, these limitations do not prevent them from contributing to cervical cancer-related fields.

In conclusion, the constructed comprehensive BNs had excellent performance in predicting survival of cervical cancer. It is necessary to consider age, histological type, and tumor differentiation when estimating the prognosis of cervical cancer using FIGO stages.

ACKNOWLEDGMENTS
This study was funded by the Doctoral Start-up Foundation of Liaoning Province (2021-BS-041) and the Natural Science Foundation of Liaoning Province (2021-YGJC-15).

DISCLOSURE
The authors have no conflict of interest.

ETHICS STATEMENT
This study was approved by the Ethics Committee of the Cancer Hospital of China Medical University.
Approval of the research protocol by an Institutional Reviewer Board: N/A.
Informed Consent: N/A.
Registry and the Registration No. of the study/trial: N/A.
Animal Studies: N/A.

ORCID
Danbo Wang  https://orcid.org/0000-0002-8925-364X

REFERENCES
1. WHO. Cervical cancer. https://www.who.int/health-topics/cervical-cancer#tab=tab_1
2. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68:394-424.
3. World Health Organization. Global strategy to accelerate the elimination of cervical cancer as a public health problem. 2020. World Health Organization.
4. SEER. Cancer Stat Facts: Cervical Cancer. 2021.
5. Baalbergen A, Ewing-Graham PC, Hop WC, Struijk P, Helmerhorst TJ. Prognostic factors in adenocarcinoma of the uterine cervix. Gynecol Oncol. 2004;92:262-267.
6. Biewenga P, van der Velden J, Mol BW, et al. Prognostic model for survival in patients with early stage cervical cancer. Cancer. 2011;117:768-776.
7. Cheung FY, Mang OW, Law SC. A population-based analysis of incidence, mortality, and stage-specific survival of cervical cancer patients in Hong Kong: 1997-2006. Hong Kong Med J. 2011;17:89-95.
8. Vinh-Hung V, Bourgain C, Vlastos G, et al. Prognostic value of histopathology and trends in cervical cancer: a SEER population study. BMC Cancer. 2007;7:164.
9. Wang C, Yang C, Wang W, et al. A prognostic nomogram for cervical cancer after surgery from SEER database. J Cancer. 2018;9:3923-3928.
10. Yu W, Huang L, Zhong Z, Song T, Shou H. A nomogram-based risk classification system predicting the overall survival of patients with newly diagnosed stage IVB cervix uteri carcinoma. Front Med. 2021;8:693567.
11. Li Z, Lin Y, Cheng B, Zhang Q, Cai Y. Prognostic model for predicting overall and cancer-specific survival among patients with cervical squamous cell carcinoma: a SEER based study. Front Oncol. 2021;11:651975.
12. Ni X, Ma X, Qiu J, Zhou S, Luo C. Development and validation of a novel nomogram to predict cancer-specific survival in patients with uterine cervical adenocarcinoma. Ann Transl Med. 2021;9:293.
13. Yang J, Tian G, Pan Z, Zhao F, Lyu J. Nomograms for predicting the survival rate for cervical cancer patients who undergo radiation therapy: a SEER analysis. Future Oncol. 2019;15:3033-3045.
14. Feng Y, Wang Y, Xie Y, Wu S, Li M. Nomograms predicting the overall survival and cancer-specific survival of patients with stage IIIC1 cervical cancer. BMC Cancer. 2021;21:1-11.
15. Zhang S, Wang X, Li Z, Wang W, Wang L. Score for the overall survival probability of patients with first-diagnosed distantly metastatic cervical cancer: a novel nomogram-based risk assessment system. Front Oncol. 2019;9:1106.
16. Madekivi V, Bostrom P, Karlsson A, Aaltonen R, Salminen E. Can a machine-learning model improve the prediction of nodal stage after a positive sentinel lymph node biopsy in breast cancer? Acta Oncol. 2020;59:689-695.
17. Gennatas ED, Wu A, Braunstein SE, et al. Preoperative and postoperative prediction of long-term meningioma outcomes. PLoS One. 2018;13:e0204161.
18. Zhao B, Gabriel RA, Vaida F, et al. Using machine learning to construct nomograms for patients with metastatic colon cancer. Colorectal Dis. 2020;22:914-922.
19. Takada M, Sugimoto M, Masuda N, et al. Prediction of postoperative disease-free survival and brain metastasis for HER2-positive breast cancer patients treated with neoadjuvant chemotherapy plus trastuzumab using a machine learning algorithm. Breast Cancer Res Treat. 2018;172:611-618.

20. Yu J, Smith VA, Wang PP, Hartemink AJ, Jarvis ED. Advances to Bayesian network inference for generating causal networks from observational biological data. Bioinformatics. 2004;20:3594-3603.
21. Scutari M. Learning Bayesian networks with the bnlearn R package. arXiv preprint arXiv:0908.3817. 2009.
22. Nagarajan R, Scutari M, Lèbre S. Bayesian networks in R. Vol. 122. Springer; 2013:125-127.
23. The Surveillance, Epidemiology, and End Results (SEER) Program. SEER Acknowledgment of Treatment Data Limitations. 2020.
24. Li R, Zhang Y, Ma B, Tan K, Lynn HS, Wu Z. Survival analysis of second primary malignancies after cervical cancer using a competing risk model: implications for prevention and surveillance. Ann Transl Med. 2021;9:239.
25. Nogueira Rodrigues A, Melo AC, Alves FVG, et al. Lack of impact of race alone on cervical cancer survival in Brazil. Asian Pac J Cancer Prev. 2018;19:1209-1214.
26. Jalloul RJ, Sharma S, Tung CS, O'Donnell B, Ludwig M. Pattern of care, health care disparities, and their impact on survival outcomes in stage IVB cervical cancer: a nationwide retrospective cohort study. Int J Gynecol Cancer. 2018;28:1003-1012.
27. Wu SG, Zhang WW, Sun JY, Li FY, He ZY, Zhou J. Multimodal treatment including hysterectomy improves survival in patients with locally advanced cervical cancer: a population-based, propensity score-matched analysis. Int J Surg. 2017;48:122-127.
28. Nishio S, Matsuo K, Yonemoto K, et al. Race and nodal disease status are prognostic factors in patients with stage IVB cervical cancer. Oncotarget. 2018;9:32321-32330.
29. Seamon LG, Tarrant RL, Fleming ST, et al. Cervical cancer survival for patients referred to a tertiary care center in Kentucky. Gynecol Oncol. 2011;123:565-570.
30. Mundt AJ, Connell PP, Campbell T, Hwang JH, Rotmensch J, Waggoner S. Race and clinical outcome in patients with carcinoma of the uterine cervix treated with radiation therapy. Gynecol Oncol. 1998;71:151-158.
31. Farley JH, Hines JF, Taylor RR, et al. Equal care ensures equal survival for African-American women with cervical carcinoma. Cancer. 2001;91:869-873.
32. Osazuwa-Peters N, Simpson MC, Rohde RL, Challapalli SD, Massa ST, Adjei BE. Differences in sociodemographic correlates of human papillomavirus-associated cancer survival in the United States. Cancer Control. 2021;28:10732748211041894.
33. Jassal JS, Cramer JD. Explaining racial disparities in surgically treated head and neck cancer. Laryngoscope. 2021;131:1053-1059.
34. Rangrass G, Ghaferi AA, Dimick JB. Explaining racial disparities in outcomes after cardiac surgery: the role of hospital quality. JAMA Surg. 2014;149:223-227.
35. Wang M, Yuan B, Zhou ZH, Han WW. Clinicopathological characteristics and prognostic factors of cervical adenocarcinoma. Sci Rep. 2021;11:7506.
36. Rendall MS, Weden MM, Favreault MM, Waldron H. The protective effect of marriage for survival: a review and update. Demography. 2011;48:481-506.
37. Robards J, Evandrou M, Falkingham J, Vlachantoni A. Marital status, health and mortality. Maturitas. 2012;73:295-299.
38. Cohen SD, Sharma T, Acquaviva K, Peterson RA, Patel SS, Kimmel PL. Social support and chronic kidney disease: an update. Adv Chronic Kidney Dis. 2007;14:335-344.

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

How to cite this article: Liu G, Yang Z, Wang D. A Bayesian network predicting survival of cervical cancer patients—Based on surveillance, epidemiology, and end results. Cancer Sci. 2023;114:1131-1141. doi:10.1111/cas.15624
