Professional Documents
Culture Documents
2021
1
{ttyagi, cmagdamo, anoori1, zli39, xliu61, mdeodhar, zhong1, wendong.ge, emye, ysheu, halabsi,
lnbrenner, grobbins, sfzafar, nbenson, lidia.moura, john.hsu, aserrano1, dprokopenko, dblacker,
rtanzi, bhyman, smukerji, mwestover, sdas5} @mgh.harvard.edu
Massachusetts General Hospital and McCance Center for Brain Health, Boston, MA
Abstract 1. Introduction
Dementia is a neurodegenerative dis- Dementia is the most common neurodegenerative dis-
order that causes cognitive decline and affects ease affecting older adults, progressing from mild
more than 50 million people worldwide. De- cognitive impairment (MCI) to mild, moderate,
mentia is underdiagnosed by healthcare pro-
and severe dementia. Dementia is underdiagnosed
fessionals — only one in four people who suf-
fer from dementia are diagnosed. Even when
(AlzheimerAssociation, 2021): dementia is not for-
a diagnosis is made, it may not be entered mally diagnosed or coded in claims data for over 50%
as a structured International Classification of of older adults living with probable dementia. Often,
Diseases (ICD) diagnosis code in a patient’s a diagnosis is given once patient has reached mod-
charts. Indeed, information relevant to cog- erate dementia, and irreversible damage has already
nitive impairment (CI) is often found within been done to the brain. The early detection of the
electronic health records (EHR) but manual re- first signs of cognitive impairment, however, is im-
view of clinician notes by experts is both time portant for improving clinical outcomes and patient
consuming and often prone to errors. Auto- management. Tools that can efficiently and effec-
mated mining of these notes presents an op- tively analyze medical records for warning signs of
portunity to label patients with cognitive im-
dementia and recommend patients for follow up with
pairment in EHR data. We developed natural
language processing (NLP) tools to identify pa-
a specialist can be critical to obtaining an early di-
tients with cognitive impairment and demon- agnosis for dementia. We aim to use NLP to detect
strate that linguistic context enhances perfor- signs of cognitive impairment from unstructured clin-
mance for the classification task. We fine-tuned ician notes by using deep learning techniques. Such a
our attention based deep learning model, which tool could also be useful in recruiting into clinical tri-
can learn from complex language structures, als as well as a variety of research studies. We apply
and substantially improved accuracy (0.93) rel- our deep learning algorithm to patients in Mass Gen-
ative to a baseline TF-IDF (term frequency- eral Brigham (MGB) Healthcare who have genotype
inverse document frequency) NLP model (0.84). data available from the MGB BioBank. An overview
Further, we show that deep learning NLP can of our project can be found in Appendix A.
successfully identify dementia patients without
dementia-related ICD codes or medications.
Keywords: EHR, NLP, Dementia 2. Related Works
Prior works have used NLP techniques to detect vari-
∗ Authors contributed equally ous diseases from EHR. (Rajkomar et al., 2018) used
© 2021 .
NLP Techniques to Detect Cognitive Impairment
2
NLP Techniques to Detect Cognitive Impairment
5. Results
We evaluated each model based on sequence level
class assignments. Model performance for each model
on the held-out test set are shown in Table 2. To com- ClinicalBERT, with its more complex architecture,
pute each metric, we used the threshold that max- was able to leverage the context of the keyword
imized accuracy. The TF-IDF model achieved an matches within the sequences and overcome these is-
AUC of 0.95 and accuracy of 0.84. The 20 words sues. Indeed, using a small dataset of manually an-
3
NLP Techniques to Detect Cognitive Impairment
4
NLP Techniques to Detect Cognitive Impairment
Emily Alsentzer, John Murphy, William Boag, Wei- Thilina Rajapakse. Simple transformers, 2020. URL
Hung Weng, Di Jindi, Tristan Naumann, and https://simpletransformers.ai/.
Matthew McDermott. Publicly available clini-
Alvin Rajkomar, Eyal Oren, Kai Chen, Andrew M
cal BERT embeddings. In Proceedings of the
Dai, Nissan Hajaj, Michaela Hardt, Peter J Liu,
2nd Clinical Natural Language Processing Work-
Xiaobing Liu, Jake Marcus, Mimi Sun, et al. Scal-
shop, pages 72–78, Minneapolis, Minnesota, USA,
able and accurate deep learning with electronic
June 2019. Association for Computational Linguis-
health records. NPJ Digital Medicine, 1(1):1–10,
tics. doi: 10.18653/v1/W19-1909. URL https:
2018.
//aclanthology.org/W19-1909.
Mohammed Saeed, Mauricio Villarroel, Andrew T
AlzheimerAssociation. 2021 alzheimer’s disease facts Reisner, Gari Clifford, Li-Wei Lehman, George
and figures. 2021. Moody, Thomas Heldt, Tin H Kyaw, Benjamin
Moody, and Roger G Mark. Multiparameter in-
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and
telligent monitoring in intensive care ii (mimic-ii):
Kristina Toutanova. Bert: Pre-training of deep
a public-access intensive care unit database. Crit-
bidirectional transformers for language under-
ical care medicine, 39(5):952, 2011.
standing. arXiv preprint arXiv:1810.04805, 2018.
Robert Tibshirani. Regression shrinkage and selec-
Benjamin S Glicksberg, Riccardo Miotto, Kipp W tion via the lasso. Journal of the Royal Statistical
Johnson, Khader Shameer, Li Li, Rong Chen, and Society: Series B (Methodological), 58(1):267–288,
Joel T Dudley. Automated disease cohort selec- 1996.
tion using word embeddings from electronic health
records. In PACIFIC SYMPOSIUM ON BIO- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob
COMPUTING 2018: Proceedings of the Pacific Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz
Symposium, pages 145–156. World Scientific, 2018. Kaiser, and Illia Polosukhin. Attention is all you
need. In Advances in neural information processing
Robert W Mahley and Stanley C Rall Jr. Apolipopro- systems, pages 5998–6008, 2017.
tein e: far more than a lipid transport protein. An-
nual review of genomics and human genetics, 1(1): Thomas Wolf, Lysandre Debut, Victor Sanh, Julien
507–537, 2000. Chaumond, Clement Delangue, Anthony Moi,
Pierric Cistac, Tim Rault, Rémi Louf, Morgan
Leland McInnes, John Healy, and James Melville. Funtowicz, Joe Davison, Sam Shleifer, Patrick von
Umap: Uniform manifold approximation and pro- Platen, Clara Ma, Yacine Jernite, Julien Plu, Can-
jection for dimension reduction. arXiv preprint wen Xu, Teven Le Scao, Sylvain Gugger, Mariama
arXiv:1802.03426, 2018. Drame, Quentin Lhoest, and Alexander M. Rush.
Huggingface’s transformers: State-of-the-art natu-
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S ral language processing. ArXiv, abs/1910.03771,
Corrado, and Jeff Dean. Distributed representa- 2019.
tions of words and phrases and their composition-
ality. In Advances in neural information processing
systems, pages 3111–3119, 2013.
5
NLP Techniques to Detect Cognitive Impairment
Appendix A. Overview
Figure 3: Overview
6
NLP Techniques to Detect Cognitive Impairment
Figure 4: Annotation UI
7
NLP Techniques to Detect Cognitive Impairment
8
NLP Techniques to Detect Cognitive Impairment
Appendix D. Keywords