∗ Corresponding author
Figure 1: Translational informatics as depicted in Prof. Matthew Hotopf’s presentation.
EHR: Electronic Health Records. CRIS: The Clinical Record Interactive Search application
used at the Biomedical Research Centre and South London and Maudsley NHS Foundation
Trust for research purposes.
Figure 2: Examples of NLP workflow including speech recognition, from Adj/Prof Hanna
Suominen’s presentation. Adapted from the Introducing the Data61/CSIRO Natural Language
Processing Team (2015) slides by Adj/Prof Leif Hanlen, Adj/Prof Hanna Suominen,
Dr Gabriela Ferraro, and Dr Lizhen Qu.
that have been taken at the Institute of Psychiatry, Psychology and Neuroscience
(IoPPN), King’s College London, in collaboration with the Department
of Computer Science, University of Sheffield, to develop NLP and information
extraction solutions for mental health research needs. The Clinical Record Interactive
Search (CRIS) application, developed at the South London and Maudsley
mental health trust (SLaM) and King’s College London, was internationally one
of the first platforms for mental health records analysis, providing researchers
access to de-identified data from the full electronic health record within a robust
governance framework [11, 12]. CRIS data have supported over 80 research
papers and a number of funded research projects since the development of the
platform in 2008¹. Examples of clinical variables that have been addressed with
NLP methods are shown in Figure 3.
Figure 3: Examples of NLP solutions for mental health documentation in CRIS, from Prof.
Robert Stewart’s presentation.
¹ http://www.maudsleybrc.nihr.ac.uk/facilities/clinical-record-interactive-search-cris/cris-publications/
The session then continued with a specific, and challenging, clinical use case:
preventing and predicting suicidal behaviour. Dr. Gergö Hadlaczky (Karolinska
Institutet, Stockholm) gave a presentation offering some historical perspectives
on how suicide risk and prediction have been defined and assessed,
and challenged these approaches by highlighting the problems with the
risk assessment tools typically used in clinical practice. Prof. Enrique Baca-García
(Fundación Jiménez Díaz Hospital, Madrid) then described work he and
his colleagues have performed using text and data mining approaches on
heterogeneous datasets, including patient-generated data, to try to improve
suicide risk prediction from different perspectives.
To initiate the afternoon discussion session, Dr. Rina Dutta (King’s College,
London), Dr. Sumithra Velupillai (KTH, Stockholm and King’s College, London)
and Dr. Johnny Downs (King’s College, London) described a use-case and
study on using NLP to identify adolescents with autism-spectrum disorders at
risk of suicidal behaviour [13]. This use-case specifically highlighted the tension
between evaluation requirements from an NLP perspective and complex clinical
outcomes in the mental health domain.²
The workshop included ample time for focused discussions. Participants
were divided into moderated groups of about six, each led by
a senior clinician and a senior informatician/computer scientist. Participants
were also encouraged to join groups with little or no
representation from their own institutions, in order to foster novel
discussions and new perspectives.
Broad discussion themes were provided as guidance for each discussion session,
based on the presentations given beforehand. The discussion sessions were
dynamic, and academically and clinically engaging. According to a post-workshop
questionnaire, all participants highly appreciated the opportunity to discuss
these evaluation and methodology issues with researchers from both the clinical
and the informatics fields, particularly in relation to scientific presentations
that highlighted different aspects of these problems. Broadly, the focus of each
discussion session was to identify current needs, including barriers and
challenges, and to define opportunities and actionable recommendations for the
future from both sides, in terms of delivering meaningful and repeatable results
and of how best to use NLP methodology in clinical research settings. We outline
the main points that were raised with respect to needs below, and elaborate on
them in subsequent sections.
• Datasets
² A forthcoming publication outlines and challenges suicide risk detection research in more
detail.
annotations, resources for particular domains and specific clinical
research problems.
– Data quality:
Even when resources are made available, the quality of the data is
not always of sufficient standard, because of, for example, missing
data, missing documentation, or insufficient annotations.
– Influence of bias:
Information about potential inherent bias (e.g., in sample selection
or information bias) in existing as well as new resources needs to be
made explicit and discussed in more detail. This issue is particularly
important for the NLP community, where bias issues have not
traditionally been addressed.
• Evaluation approaches
clinical researchers and vice versa. As a result of the discussions, the
following sections outline the methodological aspects highlighted from both a
clinical and an informatics perspective, to offer further insight into, and
guidance on, current challenges and future opportunities.
Acknowledgements
References
[8] S. Dublin, E. Baldwin, R. L. Walker, L. M. Christensen, P. J. Haug, M. L.
Jackson, J. C. Nelson, J. Ferraro, D. Carrell, W. W. Chapman, Natural
Language Processing to identify pneumonia from radiology reports,
Pharmacoepidemiol Drug Saf 22 (8) (2013) 834–841. doi:10.1002/pds.3418.