Learning in Endocrine
Neoplasms
Siddhi Ramesh, BAa,c, James M. Dolezal, MDa,c,
Alexander T. Pearson, MD, PhDa,b,c,*
KEYWORDS
Machine learning • Deep learning • Endocrine neoplasia • Pathology • Histology
Key points
• Deep learning applications in histopathology have demonstrated the ability to enhance automation to reduce pathologist workloads and to abstract image-related features that are indeterminable from pure human inspection.
• Centralized data repositories must be an emphasis to improve data access and improve the quality and reliability of histopathology-related deep learning studies.
• Standardized reporting and evaluation criteria must be established to improve study interpretability and comparability to ultimately improve clinical adoption of deep learning models.
Machine learning methods have been growing in prominence across all areas of medicine. In pathology, recent advances in deep learning (DL) have enabled computational analysis of histological samples, aiding in diagnosis and characterization in multiple disease areas. In cancer, and particularly endocrine cancer, DL approaches have been shown to be useful in tasks ranging from tumor grading to gene expression prediction. This review summarizes the current state of DL research in endocrine cancer histopathology with an emphasis on experimental design, significant findings, and key limitations.
OVERVIEW
In recent years, an exponential increase in data and computational infrastructure across all of [...] interest in developing new methods for data analysis. The use of artificial intelligence (AI) and machine learning (ML) methodologies has continued to grow across many areas within medicine. Specific methodologies such as deep learning (DL) now enable the efficient analysis of complex data to solve issues with large-scale image classification, free text natural language processing, and signal processing.1
In medicine, diabetic retinopathy image interpretation,2 electrocardiogram analysis,3 and stroke or intracranial hemorrhage4 are a few examples of areas with FDA-approved AI tools. Within oncology, areas such as cancer radiology, radiation oncology, gynecology-oncology, clinical oncology, and pathology have seen increasing numbers of DL tools undergoing successful FDA approval, with the largest number of DL tools seen in radiology and pathology applications.5 However, thus far, these FDA-approved tools are
surgpath.theclinics.com
a Department of Medicine, Section of Hematology/Oncology, University of Chicago Medical Center, 5841 South Maryland Avenue, MC 2115, Chicago, IL 60637, USA; b University of Chicago Comprehensive Cancer Center, Chicago, IL, USA; c The University of Chicago Medicine & Biological Sciences, 5841 South Maryland Avenue, Chicago, IL, USA
* Corresponding author.
E-mail address: apearson5@medicine.bsd.uchicago.edu
which function to receive multiple inputs from a prior layer to create an understanding of proximity, which can often reduce network complexity in cases where data proximity is important (eg, imaging, text, speech). Beyond specific subclasses of NNs, there are additional structural variations such as network depth. For example, shallow NNs (containing fewer layers) are used to calculate an input and output from one round of processing (eg, clinical scoring algorithms such as CURB-65).7 In contrast, deep NNs (containing more layers) can represent more complex functions and capture the underlying complexity of spatial data that requires more nuance than classical clinical scoring techniques.7
The key purpose of ML when compared with traditional statistical approaches is the concept that a model can learn from examples, rather than requiring explicit rules to be defined by a human.8 During supervised learning, an ML algorithm is trained to predict outcomes through exposure to a large number of labeled data. Most often, these predictions are used for classification tasks, in which the target variable is a discrete outcome, or for regression tasks, in which the target variable is a continuous outcome. Algorithms are provided input data, or features, and associated outputs, known as labels (Fig. 2). For applications in pathology, the input data could be a digitized image of a histopathology slide (the pixels of the image
converted into features) labeled with the correct diagnosis. The algorithm then learns to correlate these features to the provided labels through a process known as training. After being exposed to a sufficient number and diversity of examples through training, the model's ability to correctly predict labels on a set of held-out test data is formally assessed. Ideally, the test dataset should be abstracted from a source distinct from the training data so that the model can be evaluated on its ability to generalize to a completely novel setting, ensuring that true underlying and abstractable biological features are driving predictions, rather than underlying noise inherent to a particular dataset (see Fig. 2).
Conventional ML techniques, such as logistic regression, support vector machine, random forest, and gradient boosting machines, are limited in their ability to process large, complex datasets in unprocessed states.9 Constructing an effective pattern-recognition system requires significant manual engineering of input data in order to allow the algorithm to effectively extract patterns found throughout the dataset.9 In comparison, DL methods are able to better learn complex, high-level features within large datasets and are being increasingly investigated for medical applications (eg, histopathology, medical imaging, physician electronic medical record notes).10
DEEP LEARNING IN ENDOCRINE PATHOLOGY
One of the many areas in which DL applications are being increasingly explored is histopathology. In current medical practice, trained pathologists interpret histopathology slides through visual inspection, identifying characteristics that allow for the assessment of a wide range of diseases, from cancer to inflammatory disorders. As medical knowledge of human disease continues to expand, there has been a growing number of cases requiring pathological interpretation, increasing the need for high-throughput review.11
The use of computational analysis to augment tissue sample analysis is known as computational pathology, or CPATH.11 Early attempts at CPATH utilized feature engineering, or explicitly defined characteristics provided to a computer, such as cell size or shape, in an attempt to help automate pathologist analysis. However, modern CPATH projects often utilize DL techniques that obviate some manual analysis compared with other traditional approaches. Ultimately, DL methodologies can serve as a tool to augment pathologist workflows, enabling more efficiency, while democratizing pathological analysis to areas that may not otherwise have dedicated pathologists.12
In this section, we will describe the current state of DL applications in endocrine cancer pathology (Table 1), with an emphasis on experimental design, findings, and key limitations. All articles included were abstracted on March 14, 2022.
THYROID NEOPLASIA
Thyroid cancers represent the most common cancer of the endocrine system. The majority of thyroid cancer cases are of the papillary thyroid carcinoma (PTC) subtype, accounting for 70% to 80% of overall cases,13 although other subtypes include follicular thyroid carcinoma (FTC), medullary thyroid carcinoma, and anaplastic thyroid carcinoma. There has been some notable progress in CPATH applications in this domain, with applications aimed at tumor identification,14–16 classification,17–20 mutation prediction,20–23 and segmentation13 from both cytopathologic and histologic samples. Below, we briefly review a sampling of representative studies, summarizing aims, results, and limitations.
A recent study by Lin and colleagues13 used a CNN for diagnosis and segmentation of PTC with 131 fine needle aspiration and ThinPrep PTC samples digitized as whole slide images (WSIs). All PTC slides were annotated with ground truth annotations by 2 pathologists, with 28 slides used for training and 103 slides used for testing. The images were initially processed to remove background noise, and a VGG16-based CNN architecture was developed to identify and segment malignant PTC. The authors demonstrated strong results, showing that their proposed method yielded an accuracy of 99%, precision of 86%, and recall of 98%, outperforming existing architectures.13 The study is limited largely by the training dataset, in which all 131 samples were PTC, making it difficult to assess how such a model might perform when presented with benign samples or samples belonging to a different subtype of thyroid malignancy.
Dolezal and colleagues developed a regression model to predict BRAF-RAS gene expression score (BRS) and used the predicted scores to identify noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP). NIFTP is a diagnostically challenging subtype of follicular thyroid neoplasms known for its high interobserver variability and benign clinical course. It has been associated with RAS mutational profiles, whereas papillary thyroid carcinoma with extensive follicular growth (PTC-EFG) is known to be associated with BRAFV600E mutations.20 Although not all tumors carry these associated mutations, the authors hypothesized that nonmutant NIFTPs and
Deep Learning in Endocrine Neoplasms 171
Table 1
A survey of neural networks deployed in endocrine neoplasia studies
Manuscript | Disease Area(s) | Task | Method | Number of Slides/Samples | Type of Data | Validation Strategy
Lin et al,13 2021 | Thyroid neoplasia | Diagnosis | CNN | 131 | Cytology | Internal validation
Dolezal et al,20 2021 | Thyroid neoplasia | Gene expression prediction | CNN | 115 | Histology | External validation
Böhland et al,19 2021 | Thyroid neoplasia | Classification (subtype) | CNN | 289 | Histology | External validation
Kriegsmann et al,30 2021 | Pancreatic neoplasia | Diagnosis | CNN | 201 | Histology | Internal validation
Naito et al,31 2021 | Pancreatic neoplasia | Diagnosis | CNN | 532 | Histology | Internal validation
Redemann et al,35 2020 | Neuroendocrine neoplasia | Site of origin prediction | CNN | 215 | Histology | Internal validation
Govind et al,36 2020 | Neuroendocrine neoplasia | Grading | CNN | 50 | Histology | Internal validation
Dum et al,41 2022 | Adrenal neoplasia | Immune infiltration prediction | CNN | 9405 | Histology (IHC) | Internal validation
follicular-patterned PTCs might still possess differences in BRAF-RAS spectrum gene expression that could be leveraged to train DL models to learn the histologic differences between these classes. They trained a regression model on 386 slides from The Cancer Genome Atlas to predict BRS, and generated BRS predictions on an external dataset of 115 slides of classic PTC, PTC-EFG, NIFTP, and benign follicular adenoma (FA). The authors found that the DL BRS predictions were highly associated with the NIFTP subtype in the external dataset, with RAS-like BRS predictions identifying NIFTP neoplasms with a sensitivity of 97.9% and a specificity of 96.6%. The study is limited by the utilization of pathologist-annotated regions of interest and a lack of ground-truth gene expression scores on the external dataset.
Böhland and colleagues19 explored DL applications in PTC by comparing their performance with that of feature-based ML classification methods in differentiating samples with and without "papillary thyroid carcinoma-like" nuclei. PTC-like nuclei may be seen in NIFTP and follicular variants of papillary thyroid carcinoma (FVPTC); however, FA and FTC are neoplastic subtypes that do not demonstrate PTC-like nuclei. The study used 2 datasets: the Tharun and Thompson dataset (156 samples divided into FTC, FA, NIFTP, FVPTC, and PTC) and the Nikiforov dataset (133 samples divided into benign, FTC, classic PTC, invasive FVPTC, and encapsulated FVPTC). The feature-based classification model used a pretrained model (trained externally on a nonthyroid dataset) to identify nuclei within the WSIs, extracted features from the segmented nuclei, and subsequently scored each slide based on these abstracted features, using an aggregated score threshold to determine a classification of PTC-like versus non-PTC-like. The DL approach also leveraged a model pretrained on ImageNet.24 Ultimately, the results demonstrate that the feature-based approach achieved an accuracy of 89.7% and 83.5% on the Tharun and Thompson and Nikiforov datasets, respectively, whereas the DL approach yielded results of 83.6% and 89.1%. The study is limited by limited inclusion of borderline datasets, a heavy reliance on image preprocessing, and a limited emphasis on the explainability of results.
172 Ramesh et al
focused on DL applications within the NEN disease area,33–35 particularly over the last 3 years; however, as seen in other disease areas, the majority of studies focus on radiomics applications, likely given the ubiquity of radiological datasets relative to other data modalities such as histopathology.34 A subset of literature with notable findings is highlighted below.
Redemann and colleagues35 developed a DL model to predict the site of origin for metastatic well-differentiated NETs, given that the task is difficult to accomplish consistently even with IHC techniques. The study used 215 well-differentiated NET hematoxylin and eosin-stained slides with a known primary site. Of the overall sample, 130 slides were used for training and 85 slides were used for testing. The DL model achieved an accuracy of 72% in site-of-origin prediction compared with 82% for IHC, and the difference in performance between the IHC gold-standard approach and DL was not statistically significant. The study is limited by an overall small sample size and the lack of a discrete external validation set to evaluate model generalizability. Furthermore, there was no emphasis on an analysis of model explainability. Nonetheless, the results do provide an initial proof-of-concept for DL approaches to be used in this disease area.
Another study by Govind and colleagues36 evaluated the ability of a DL platform to improve the accuracy of GI-NET grading. The study looked to improve GI-NET grading by building on traditional metrics such as the Ki-67 index. The authors used 50 samples derived from various GI sites (8 stomach, 13 small bowel, 5 appendix, 3 colon, 16 rectum, 5 pancreas), with 2 samples discarded due to staining issues.37 The authors first developed an initial integrated approach termed "Synaptophysin-Ki-67 Index Estimator," in which a non-DL model was trained to locate tumor cells, detect hot spots, and calculate the Ki-67 index. The WSIs were cropped into hot-spot-sized tiles and categorized into 4 separate categories of background, nontumor, G1 tumor, and G2 tumor, which served as ground truth for their subsequent DL approach. These tiles were then used to train (42 cases; 15,232 image tiles) and test (6 cases; 9436 image tiles) their DL model. The results demonstrated that, when compared with the gold-standard (GS) approaches, the DL model agreed with tumor grade in 45 out of 48 (93.8%) cases and had a Ki-67 index error (difference between GS index and estimated index) of 0.84 ± 1.02%. The study demonstrated an interesting methodology with fully automated "hot-spot detection"; however, it remains constrained in its evaluation. The study is limited overall by its low sample size and lack of true ground truth metrics for training and evaluation of model performance. Additionally, there was no external dataset used for the evaluation of model performance, making it difficult to assess whether this approach can perform effectively in novel settings.
PITUITARY, PARATHYROID, AND ADRENAL NEOPLASIA
When compared with other endocrine neoplasms, pituitary, parathyroid, and adrenal neoplasias are among the least prevalent.38,39 These neoplasms have been explored with DL applications using radiomics and genomics data40; however, there are currently limited examples of studies using histopathology datasets. This may be due to the relative scarcity of data in these domains, particularly given the relative rarity of malignant neoplasms of the pituitary, parathyroid, and adrenal glands.
Dum and colleagues41 evaluated 90 different tumor entities (including adrenocortical adenoma and pheochromocytoma) to assess the feasibility of a high-throughput analysis of lymphocyte subpopulations by using an AI-supported multiple antibody (cytotoxic T-lymphocyte associated protein 4 [CTLA-4]) approach within multiple tumor subtypes. The study used 2 different CTLA-4 antibodies due to limitations with using a single antibody on formalin-fixed tissue. The study incorporated 9405 images from the 90 tumor types, which were used to train and validate a DL approach for detecting nonspecific staining. The digital images were analyzed using a multistep approach, using a CNN (U-Net) for automated quantification of CTLA-4+ cells and another deep NN (DeepLab3+) for the detection of nonspecific (−) CTLA-4 staining. The results for the density of CTLA-4+ cells in the tumor categories identified a clone-dependent nonspecific staining pattern in adrenal cortical adenoma (63%) for MSVA-152R and in pheochromocytoma (67%). The authors found that high CTLA-4+ cell density was associated with a low pT category, absent lymph node metastases, and PD-L1 expression in tumor or inflammatory cells. Overall, the study demonstrated the ability of DL-assisted approaches to assist with immunostaining and identified potentially novel biological links between CTLA-4 lymphocytes and prognostic cancer features. The study was limited by sample sizes and potential issues with cross-reactivity that may hinder reproducibility across all tumor subtypes. Moreover, there was no external dataset used for validation of this approach. Finally, prior meta-analyses have indicated that there is no significant association between CTLA-4 expression and overall survival in multiple cancer subtypes, contradicting some of this study's findings.42
2. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016;316(22):2402–10.
3. Giudicessi JR, Schram M, Bos JM, et al. Artificial intelligence–enabled assessment of the heart rate corrected qt interval using a mobile electrocardiogram device. Circulation 2021;143(13):1274–86.
4. Ratner M. FDA backs clinician-free AI imaging diagnostic tools. Nat Biotechnol 2018;36(8):673–4.
5. Luchini C, Pea A, Scarpa A. Artificial intelligence in oncology: current applications and future perspectives. Br J Cancer 2022;126(1):4–9.
6. Kleppe A, Skrede OJ, De Raedt S, et al. Designing deep learning studies in cancer diagnostics. Nat Rev Cancer 2021;21(3):199–211.
7. Chary MA, Manini AF, Boyer EW, et al. The Role and Promise of Artificial Intelligence in Medical Toxicology. J Med Toxicol 2020;16(4):458–64.
8. Rajkomar A, Dean J, Kohane I. Machine Learning in Medicine. N Engl J Med 2019;380(14):1347–58.
9. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–44.
10. Esteva A, Robicquet A, Ramsundar B, et al. A guide to deep learning in healthcare. Nat Med 2019;25(1):24–9.
11. van der Laak J, Litjens G, Ciompi F. Deep learning in histopathology: the path to the clinic. Nat Med 2021;27(5):775–84.
12. Ahmad Z, Rahim S, Zubair M, et al. Artificial intelligence (AI) in medicine, current applications and future role with special emphasis on its potential and promise in pathology: present and future impact, obstacles including costs and acceptance among pathologists, practical and philosophical considerations. A comprehensive review. Diagn Pathol 2021;16(1):24.
13. Lin YJ, Chao TK, Khalil MA, et al. Deep learning fast screening approach on cytological whole slides for thyroid cancer diagnosis. Cancers (Basel) 2021;13(15):3891.
14. Dov D, Kovalsky SZ, Assaad S, et al. Weakly supervised instance learning for thyroid malignancy prediction from whole slide cytopathology images. Med Image Anal 2021;67:101814.
15. Elliott Range DD, Dov D, Kovalsky SZ, et al. Application of a machine learning algorithm to predict malignancy in thyroid cytopathology. Cancer Cytopathology 2020;128(4):287–95.
16. Sanyal P, Mukherjee T, Barui S, et al. Artificial intelligence in cytopathology: A neural network to identify papillary carcinoma on thyroid fine-needle aspiration cytology smears. J Pathol Inform 2018;9(1):43.
17. Wang Y, Guan Q, Lao I, et al. Using deep convolutional neural networks for multi-classification of thyroid tumor by histopathology: a large-scale pilot study. Ann Transl Med 2019;7(18):468.
18. El-Hossiny AS, Al-Atabany W, Hassan O, et al. Classification of Thyroid Carcinoma in Whole Slide Images Using Cascaded CNN. IEEE Access 2021;9:88429–38.
19. Böhland M, Tharun L, Scherr T, et al. Machine learning methods for automated classification of tumors with papillary thyroid carcinoma-like nuclei: A quantitative analysis. PLoS One 2021;16(9):e0257635.
20. Dolezal JM, Trzcinska A, Liao CY, et al. Deep learning prediction of BRAF-RAS gene expression signature identifies noninvasive follicular thyroid neoplasms with papillary-like nuclear features. Mod Pathol 2021;34(5):862–74.
21. Anand D, Yashashwi K, Kumar N, et al. Weakly supervised learning on unannotated H&E-stained slides predicts BRAF mutation in thyroid cancer with high accuracy. J Pathol 2021;255(3):232–42.
22. Tsou P, Wu CJ. Mapping driver mutations to histopathological subtypes in papillary thyroid carcinoma: applying a deep convolutional neural network. J Clin Med 2019;8(10):1675.
23. Fu Y, Jung AW, Torne RV, et al. Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nat Cancer 2020;1(8):800–10.
24. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition; 2009:248–255. doi:10.1109/CVPR.2009.5206848.
25. Cancer of the pancreas - cancer stat facts. SEER. Available at: https://seer.cancer.gov/statfacts/html/pancreas.html. Accessed April 5, 2022.
26. Kenner B, Chari ST, Kelsen D, et al. Artificial intelligence and early detection of pancreatic cancer. Pancreas 2021;50(3):251–79.
27. Fu H, Mi W, Pan B, et al. Automatic pancreatic ductal adenocarcinoma detection in whole slide images using deep convolutional neural networks. Front Oncol 2021;11:665929.
28. Wu W, Liu X, Hamilton RB, et al. Graph Convolutional Neural Networks for Histological Classification of Pancreatic Cancer 2022;28. https://doi.org/10.1101/2022.01.26.22269832, 2022.01.26.22269832.
29. Chang YH, Thibault G, Madin O, et al. Deep learning based Nucleus Classification in pancreas histological images. In: 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC); 2017:672–675. doi:10.1109/EMBC.2017.8036914.
30. Kriegsmann M, Kriegsmann K, Steinbuss G, et al. Deep learning in pancreatic tissue: identification of anatomical structures, pancreatic intraepithelial neoplasia, and ductal adenocarcinoma. Int J Mol Sci 2021;22(10):5385.
31. Naito Y, Tsuneki M, Fukushima N, et al. A deep learning model to detect pancreatic ductal adenocarcinoma on endoscopic ultrasound-guided fine-needle biopsy. Sci Rep 2021;11(1):8454.
32. Dasari A, Shen C, Halperin D, et al. Trends in the incidence, prevalence, and survival outcomes in patients with neuroendocrine tumors in the United States. JAMA Oncol 2017;3(10):1335–42.
33. Wallace PW, Conrad C, Brückmann S, et al. Metabolomics, machine learning and immunohistochemistry to predict succinate dehydrogenase mutational status in phaeochromocytomas and paragangliomas. J Pathol 2020;251(4):378–87.
34. Pantelis AG, Panagopoulou PA, Lapatsanis DP. Artificial intelligence and machine learning in the diagnosis and management of gastroenteropancreatic neuroendocrine neoplasms: a scoping review. Diagnostics 2022;12(4):874.
35. Redemann J, Schultz FA, Martinez C, et al. Comparing deep learning and immunohistochemistry in determining the site of origin for well-differentiated neuroendocrine tumors. J Pathol Inform 2020;11:32.
36. Govind D, Jen KY, Matsukuma K, et al. Improving the accuracy of gastrointestinal neuroendocrine tumor grading with deep learning. Sci Rep 2020;10(1):11064.
37. Matsukuma K, Olson KA, Gui D, et al. Synaptophysin-Ki67 double stain: a novel technique that improves interobserver agreement in the grading of well-differentiated gastrointestinal neuroendocrine tumors. Mod Pathol 2017;30(4):620–9.
38. Chen C, Hu Y, Lyu L, et al. Incidence, demographics, and survival of patients with primary pituitary tumors: a SEER database study in 2004–2016. Sci Rep 2021;11(1):15155.
39. Correa P, Chen VW. Endocrine gland cancer. Cancer 1995;75(1 Suppl):338–52.
40. Thomasian NM, Kamel IR, Bai HX. Machine intelligence in non-invasive endocrine cancer diagnostics. Nat Rev Endocrinol 2022;18(2):81–95.
41. Dum D, Henke TLC, Mandelkow T, et al. Semi-automated validation and quantification of CTLA-4 in 90 different tumor entities using multiple antibodies and artificial intelligence. Lab Invest 2022;1–8. https://doi.org/10.1038/s41374-022-00728-4.
42. Hu P, Liu Q, Deng G, et al. The prognostic value of cytotoxic T-lymphocyte antigen 4 in cancers: a systematic review and meta-analysis. Sci Rep 2017;7(1):42913.
43. Kochanny S, Pearson A. Academics as leaders in the cancer artificial intelligence revolution. Cancer 2020;127. https://doi.org/10.1002/cncr.33284.
44. Sandfort V, Yan K, Pickhardt PJ, et al. Data augmentation using generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation tasks. Sci Rep 2019;9(1):16884.
45. Mikołajczyk A, Grochowski M. Data augmentation for improving deep learning in image classification problem. In: 2018 International Interdisciplinary PhD Workshop (IIPhDW); 2018:117–122. doi:10.1109/IIPHDW.2018.8388338.
46. Wei J, Suriawinata A, Vaickus L, et al. Generative Image Translation for Data Augmentation in Colorectal Histopathology Images. Proc Mach Learn Res 2019;116:10–24.
47. Howard FM, Dolezal J, Kochanny S, et al. The impact of site-specific digital histology signatures on deep learning model accuracy and bias. Nat Commun 2021;12(1):4423.
48. Collins GS, Dhiman P, Navarro CLA, et al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open 2021;11(7):e048008.