
WHITE PAPER

French Hospital Uses Trusted Analytics Platform to Predict Emergency Department Visits and Hospital Admissions

Trusted Analytics Platform (TAP) is a collaborative environment for creating advanced analytics and applications to help hospitals improve patient care and resource allocation.

Kyle Ambert
Data Scientist and Health Analytics Technical Lead, Intel Corp., PhD

Sébastien Beaune
Emergency Department Director, AP-HP Ambroise Paré, MD, PhD

Adel Chaibi
Application Developer, Intel Corp.

Luc Briard
Data Scientist, Intel Corp., MBA

Amit Bhattacharjee
Data Scientist, Intel Corp., MSc

Venkatesh Bharadwaj
Data Scientist, Intel Corp., MS

Krishna Sumanth
Data Scientist, Intel Corp., MS

Kathleen Crowe
Director of Data Science, Intel Corp., PhD

Overview

For hospital administrators, predicting the number of patient visits to emergency departments, along with their admission rates, is critical for optimizing resources at all levels of staff. Ultimately, this reduces wait times in emergency departments and improves the quality of patient care.

Intel and the Assistance Publique-Hôpitaux de Paris (AP-HP), the largest university hospital in Europe, worked together to build a cloud-based solution for predicting the expected number of patient visits and hospital admissions using advanced data science methodologies and the Trusted Analytics Platform (TAP). TAP is an open source platform that accelerates the creation of applications driven by big data analytics. Using data from four emergency departments within AP-HP, data scientists from Intel and medical experts from AP-HP evaluated three different approaches to time series analytics, optimizing model parameters and identifying the best predictive features to include in each. The team selected an Autoregressive Integrated Moving Average with Exogenous Input (ARIMAX) approach that proved to be simultaneously accurate, scalable, and easily adaptable to the needs of both data scientists and hospital staff.

The team moved into the model optimization phase of the project, using such metrics as the Akaike Information Criterion, or AIC, to explore which features to include in the model to balance accuracy and complexity. The team also developed an Apache* Spark-based implementation of the ARIMAX algorithm to take advantage of the speed and scalability of TAP’s distributed processing infrastructure.

The data science team then implemented the data model in TAP for efficient cloud-based computation and to streamline workflows with application developers, who created a secure, browser-based user interface to interact with the results.

This prototype interface, which is undergoing testing at AP-HP hospitals, will enable hospital administrators to view 15-day predictions of emergency department visits and hospital admissions to optimize staffing levels based on anticipated needs. In the future, AP-HP plans to leverage TAP’s highly scalable environment to process larger data sets to increase the accuracy of its predictions, and to investigate other healthcare challenges that could be addressed through the analysis of big data.
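For readers unfamiliar with the two terms introduced in the overview, one common textbook formulation of each is given below. It is included for orientation only: the white paper does not state the exact parameterization the team used, and every symbol here is an assumption made for illustration.

```latex
% ARIMAX(p, d, q) with r exogenous inputs, written on the d-times differenced
% series y'_t = (1 - B)^d y_t, where B is the backshift operator.
y'_t = c + \sum_{i=1}^{p} \phi_i \, y'_{t-i}
         + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
         + \sum_{k=1}^{r} \beta_k \, x_{k,t}
         + \varepsilon_t

% Akaike Information Criterion for a fitted model with m estimated parameters
% and maximized likelihood \hat{L}; a lower value is better.
\mathrm{AIC} = 2m - 2 \ln \hat{L}
```

In this form the autoregressive terms carry the previously observed values, the exogenous terms carry external inputs such as weather or flu rates, and the AIC penalty of 2 per parameter is what discourages adding features that barely improve the fit.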

Business Drivers

Overcrowding is a growing problem in hospitals around the world. It results in backups in the emergency room, with patients occasionally being turned away from emergency departments due to lack of space, beds, or staff to treat them.

Predicting the utilization of their emergency departments is a major priority for administrators at AP-HP, both as a means for reducing patient wait times and for improving patient care. Inability to accurately plan for future patient loads is a cause of frustration and concern, as some emergency rooms in AP-HP may be overcrowded while others may be underutilized. In addition, patients may not have sufficient information about specialty clinics within the network to ensure they go to the most appropriate emergency department for specific ailments or injuries. These challenges make it difficult for administrators to effectively allocate staff and other resources, because they are unable to accurately anticipate the need for beds both in emergency departments and in other parts of the hospital.

Emergency Departments with Rich Datasets

In 2003, an intense heat wave hit France, resulting in the death of thousands of citizens, in part due to inadequate emergency health facilities to deal with the crisis. The French government responded by making major investments in new emergency departments, including new electronic information systems. With a decade’s worth of both clinical and operational data within its emergency department networks, AP-HP has a rich dataset with the potential to serve as a basis for meaningful big data analytics solutions for addressing a range of challenges in healthcare. In addition, clinicians and data scientists at AP-HP were aware of recent studies suggesting correlations between external factors such as weather, holidays, and flu incidence and increased emergency room visits.

AP-HP began discussions of how to use its data to support development of an analytics and machine learning solution that could help predict emergency department visits and hospital admissions. They wished to incorporate additional data sources into a predictive model with their own data, to evaluate whether including external factors would contribute to greater accuracy in visit- and admission-rate projections. AP-HP’s initial goal was to develop a web interface that would allow hospital administrators to view predictions of future emergency department visits and hospital admissions. A future goal was to offer a public-facing version of the interface that would allow incoming patients to view anticipated wait times at all of the network emergency departments and to evaluate the medical specialties at various hospitals before determining which AP-HP facility to visit.

“The primary consideration in the development of our data science project with Intel was to improve patient care and outcomes,” said Raphael Beaufret, the head of web innovation in the AP-HP CIO’s data department. “If patients had access to accurate real-time data on wait times at all AP-HP emergency departments, and information about the availability of specialized care opportunities at various locations, some of them could make more informed decisions about where they seek care.”

Project Overview

“We worked closely with AP-HP to build a prototype cloud-based solution designed to predict the expected number of patient visits and admissions over a moving 15-day window. This proof-of-concept (POC) used retrospective data to build a predictive model of patient visits for four of the French hospitals in AP-HP,” said Kyle Ambert, senior data scientist and technical lead for Health & Life Science, Intel Corp.

Also contributing to the collaboration was ETALAB, a bureau of the French government responsible for the direction of France’s open data initiatives. The AP-HP project was developed at Teratec, a Paris-area campus founded by a consortium of government agencies, academic institutions, and private companies, including Intel.

Data Wrangling and Preparation

France has strict data privacy laws, which prevented AP-HP from providing Intel with its source data directly.

Because ETALAB had permission to work with the protected health information contained in the raw data, it was responsible for managing the full anonymization of the dataset, removing any information that could lead to the identification of a patient. The dataset provided to Intel, which contained data on more than 470,000 patient encounters, included the hospital site of the encounter, whether the patient was admitted to the emergency department or later to the hospital, and the date and times of patient movements in or out of the emergency department.

Given the limitations brought on by dealing with anonymized records, data scientists from AP-HP and Intel were restricted in the queries that could be performed on the data. However, by including environmental data, such as weather and flu rates from publicly available data sources, they could explore the usefulness of adding these external data sources to provide more context to the patient information.

Phase One: Modeling

After the full dataset was prepared and assembled by ETALAB, the project moved forward in two phases. The first involved the bulk of the data science research, along with the initial model evaluations. During this phase, the Intel and AP-HP teams cleaned and sampled the data as well as surveyed a variety of possible methods for predicting hour-by-hour visits and admissions at each site.

Though the team’s primary goal at this point was determining the time
series algorithm that most accurately tracked the peaks and valleys of the observed data from AP-HP, there were also important secondary goals. The selected analytical model would need to scale for the analysis of much larger data sets than those used in the POC, and easily adapt to a distributed computing framework. The model would also need to be readily interpretable to a non-statistical group of users, namely clinicians.

The team evaluated three time-series analysis approaches for deployment with the AP-HP data. For each method, the team optimized model parameters and identified the best predictive features to include.

Phase Two: Selecting and Training the Model

After evaluating the three time series analysis models, the data science team from Intel chose the ARIMAX approach, based on its ability to address the requirements for accuracy, scalability, and versatility. The team then tuned the model for production on TAP.

ARIMAX is an algorithm used frequently in the financial sector, where it’s employed for predicting changes in stock, commodity, and other financial market value. It provides a standard approach for expressing a linear regression model in which predictive variables are used to predict some future outcome. ARIMAX takes into account previously observed values and incorporates them into the predictive model.

With an optimized model and a final dataset (which included the anonymized data from AP-HP, as well as a number of external, open datasets, including weekly reported flu cases throughout France, maximum and minimum daily temperatures recorded at Charles De Gaulle airport, and holiday and post-holiday indicator variables), the team was ready to deploy the model on TAP, where big data analysis could quickly be performed and results easily shared with application developers.

Solution Details

The data science team evaluated the final dataset and the ARIMAX model with respect to the AIC, a metric for quantifying the tradeoff between model accuracy and model complexity. Generally speaking, data analysts aim to find a model that leads to the greatest accuracy for a given set of data. However, focusing simply on increasing the model’s accuracy by adding an abundance of predictive variables or over-training the model for a data subset can lead to overly complex models that aren’t able to generalize to new data. AIC rewards how well a model fits a given set of training data while penalizing the number of parameters it estimates, which guards against models that will not generalize to unseen data.

The team performed a traditional model comparison of the different parameterizations for ARIMAX to identify and fix the parameter settings that minimized the objective function (the AIC), then evaluated the external feature combinations to similarly minimize AIC. By running successive tests and adding and removing variables, the team arrived at a model that yielded the greatest accuracy with the least complexity.

The results of the model exploration phase, using the AIC metric, indicated that the best-performing models (see Figure 1; a lower AIC number is better) did not utilize all the explanatory variables that could be logically expected to be useful as predictors. The optimal combination of features for this dataset was the estimated number of visits, a weekend indicator variable, flu rates, seasonality, and the daily maximum and minimum temperatures at Charles De Gaulle airport. Notably, models using holiday information did not perform as well, indicating that these data were not predictive for emergency department visits and hospitalization for this population.

Figure 1. Impact of the combination of datasets within AIC (lower AIC number is better).

To illustrate the contribution that a single feature such as seasonality could make in the model, the team compared the predicted and actual admission rates for the model with and without seasonality. Additionally, this comparison enabled visualization of the predicted and observed admission rates (see Figure 2). The models generally followed the patterns of the observed data well, with an overall variance of around 5 percent. However, the limits on the model’s accuracy speak to the inherent difficulty of predicting occurrences that are triggered by accidents and unforeseen events.

The data science team deployed a distributed implementation of the ARIMAX time series algorithm on TAP, to take advantage of TAP’s capacity for massive scalability and fast processing. The implementation uses the Apache Spark Resilient Distributed Dataset (RDD) paradigm, which keeps RDDs in persistent memory for iterative processing to minimize the amount of time spent moving data into memory for repeated computations. The data is distributed to different compute nodes in the cluster for processing, then reduced back into memory on the control node, with no need to reengineer the code for scalability as data size increases.
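The distribute-then-reduce flow described above can be sketched in plain Python. This is a minimal illustration of the pattern only; it is not the team's Spark implementation and does not use the Spark API. The function names (`partition`, `partial_error`, `combine`, `mean_squared_error`) and the choice of squared forecast error as the quantity being aggregated are assumptions made for the example.

```python
from functools import reduce


def partition(values, n_parts):
    """Split a list into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(values), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def partial_error(chunk):
    """Map step: per-partition count and sum of squared forecast errors."""
    return (len(chunk), sum((actual - predicted) ** 2 for actual, predicted in chunk))


def combine(left, right):
    """Reduce step: merge two partial results (associative, so order-free)."""
    return (left[0] + right[0], left[1] + right[1])


def mean_squared_error(pairs, n_parts=4):
    """Aggregate (actual, predicted) pairs partition by partition."""
    # In a real cluster each chunk would live on a different compute node;
    # here the "nodes" are simply list slices processed one after another.
    partials = map(partial_error, partition(pairs, n_parts))
    count, total = reduce(combine, partials, (0, 0.0))
    return total / count
```

Because `combine` is associative, the result does not depend on how many partitions the data is split into, which is the property that lets the same code run unchanged as the data, and the number of nodes, grows.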

TAP will ensure that the optimized model will scale to support datasets far larger than the current data (derived from four hospitals), as the ultimate goal is to analyze data from all 44 locations. Not only is TAP designed for fast processing of large datasets and for massive scalability, but it also includes the ability to incorporate new algorithms, tools, and custom features as requirements evolve.
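The AIC-guided feature search described under Solution Details can be illustrated with a small, self-contained sketch. It is not the team's ARIMAX code: to stay dependency-free it swaps in ordinary least squares for ARIMAX, uses the Gaussian form of the criterion (AIC = n ln(RSS/n) + 2k, up to an additive constant), and runs on synthetic data whose feature names (`visits`, `weekend`, `holiday`) merely echo the white paper. All function names and numeric values are hypothetical.

```python
import itertools
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_aic(rows, y):
    """Fit least squares with an intercept and return the Gaussian AIC."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    rss = sum((t - sum(b * v for b, v in zip(beta, d))) ** 2 for d, t in zip(design, y))
    n = len(y)
    # k coefficients plus one noise-variance parameter are penalized.
    return n * math.log(rss / n) + 2 * (k + 1)


def best_subset_by_aic(features, y):
    """Score every non-empty feature subset; return (lowest_aic, names)."""
    names = sorted(features)
    best = None
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            rows = list(zip(*(features[name] for name in subset)))
            score = ols_aic(rows, y)
            if best is None or score < best[0]:
                best = (score, subset)
    return best


def make_synthetic_data(n_days=200, seed=0):
    """Synthetic stand-in data (an assumption, not AP-HP data)."""
    rng = random.Random(seed)
    weekend = [1.0 if day % 7 in (5, 6) else 0.0 for day in range(n_days)]
    visits = [200 + 30 * w + rng.gauss(0, 15) for w in weekend]
    # The holiday flag is generated independently of admissions: pure noise.
    holiday = [1.0 if rng.random() < 0.05 else 0.0 for _ in range(n_days)]
    admissions = [0.25 * v + 8 * w + rng.gauss(0, 3) for v, w in zip(visits, weekend)]
    return {"visits": visits, "weekend": weekend, "holiday": holiday}, admissions
```

Calling `best_subset_by_aic(*make_synthetic_data())` returns the lowest AIC found and the feature names that produced it. Because the synthetic `holiday` column carries no signal, the 2-per-parameter penalty will usually, though not always, keep it out of the winning subset, which mirrors the kind of outcome the white paper reports for holiday information.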

Figure 2. Admission predictions for a model with (red) and without (blue) the seasonality feature, compared with the actual observed rates (black).

Contributions to the Open Source Community

Another outcome of the team’s work on the AP-HP project was the contribution of the fully developed ARIMAX time series algorithm to the open source community through spark-ts, a library for time series analysis on Apache Spark.

Browser-Based Application

In TAP, the data science team built a browser-based user interface for hospital administrators, physicians, and medical staff to easily visualize forecasted near-term hospital patient loads and plan resource allocation.

In its current implementation, the application enables hospital employees to view the past 15 days of reported data for emergency department visits and hospital admissions, and to view predicted visits and admissions for the subsequent 15 days.

Because the application was also developed in TAP, data scientists and application developers from Intel and AP-HP were able to work together closely during development. Figures 3 and 4 illustrate the application user interface.

Figure 3. The blue line represents the number of visitors to one of the AP-HP emergency departments, while the orange line represents admissions to the hospital from the emergency department at that location. To the left of the vertical red dotted line, which represents the present day, the data points reflect reported data while the dotted lines to the right are predicted levels of utilization.

Figure 4. The blue line represents the number of visitors to one of the AP-HP emergency departments, while the orange line represents admissions to the hospital from the emergency department at that location. To the left of the vertical red dotted line, which represents the present day, the data points reflect reported data while the dotted lines to the right are predicted levels of utilization.

TAP Benefits with the AP-HP Proof-of-Concept

For the Intel and AP-HP analytics team, scalability was one of the primary benefits of using TAP. Once the code and the application were optimized, the solution could scale to process much higher volumes of data. This was particularly important when running successive time series models to
provide data for analytical predictions. Forecasting the number of admissions to the hospital for any given day relies on input from the prediction of that same day’s number of visitors to the emergency room. TAP’s ability to scale for efficiently running these large and complex time series models is important for the effectiveness of the AP-HP solution. TAP’s scalability will also be key to the solution’s ability to efficiently process datasets from AP-HP’s larger data lake, fed by its 44 member hospitals.

Although the AP-HP data provided to the Intel team had been anonymized, TAP is optimized for security to ensure that data and processing are protected from end to end. Had AP-HP run its data on an internal cloud-based TAP environment and used its own data scientists to analyze the data, TAP’s security features would have satisfied French privacy mandates and de-identification of the data would not have been necessary.

Statistical experimentation on models and algorithms formed a large part of the initial research into the AP-HP solution. TAP’s collaborative environment for advanced analytics facilitated the iterative back-and-forth between teams of data scientists, which enabled faster model development and ensured the models performed as intended. In addition, the capabilities of TAP extend into solution development, enabling data scientists and application developers to work together closely in a shared environment to iterate both data and application formats for delivery of an optimal solution.

Trusted Analytics Platform

TAP is open source software optimized for performance and security that accelerates the creation of applications driven by big data analytics. TAP makes it easier for application developers and data scientists (at enterprises, cloud service providers, and system integrators) to collaborate by providing a shared, flexible environment for advanced analytics in public and private clouds. Data scientists get extensible tools, scalable machine learning algorithms, and powerful engines to train and deploy predictive models. Application developers get consistent APIs, services, and runtimes to help integrate these models into applications quickly. System operators get an integrated stack that they can easily provision in a cloud infrastructure.

TAP helps lower development costs and reduces time to deploy analytics applications for organizations that want to create custom solutions using big data analytics. Tested by data scientists in various industries, TAP uniquely provides a complete pipeline for graph analysis as well as scalable algorithms and in-memory engines for machine learning. TAP delivers open source software as an integrated platform with hardware-enhanced performance and security in every layer.

Figure 5. The anatomy of TAP.

Summary

“We undertook a joint project, with AP-HP, to research and implement a big data analytics-based solution for predicting emergency room visits and hospital admissions in a university hospital in France. Though the project is still in its early phases, the Intel and AP-HP data science teams have succeeded in creating a POC that met AP-HP’s goals for the project and points to further opportunities
to collaborate in advancing the role of data analytics in healthcare,” said Ambert.

“In addition to helping specific hospitals organize and allocate resources to avoid overcrowding and other emergency room challenges, I see a number of other uses for this solution,” said Dr. Saïk URIENS, Unité de Recherche Clinique et CIC Paris Descartes Necker Cochin, APHP, Research Director at Inserm. “Moving forward, our hospital administrators and medical staff should be able to use this predictive analysis, on retrospective big data, not only to estimate the number of emergency department visits (including admissions rate), but also to categorize these visits. For instance, it would be very helpful to have predictive data that showed the proportion of children, adults and elderly persons or men and women, as well as type of medical causes (infection, cardiology, surgery, traumatology, etc.).”

“Seeing the prediction application take advantage of all the data and provide useful and actionable insights has allowed our medical staff to imagine the tremendous benefit it will provide to both the staff and our patients,” said Dr. Sébastien Beaune, emergency department director at AP-HP. “Having a better understanding of patient flows at our emergency departments, or even predicting these flows, is absolutely key if we want to improve our quality of care.”

The Intel and AP-HP collaboration successfully created a merged dataset that incorporated de-identified data from four AP-HP hospitals with exogenous public datasets that described external conditions in France that might relate to hospital visit and admission rates.

The data science team evaluated a number of time series analysis algorithms to determine which approach optimized requirements for accuracy, scalability, and interpretability by non-data science users. The team chose the time series analysis method ARIMAX and deployed it on TAP, a collaborative, flexible environment for developing advanced analytics solutions. The TAP analytics and development environment was also instrumental in creating a prototype browser interface that displays predicted patient visits and admission rates. The result was an application that makes it simple for hospital administrators and clinicians to better anticipate staffing levels and improve allocation of resources, all with the end goal of improving care delivery for patients within the AP-HP hospital network.

To learn more, visit:

Intel Big Data & Analytics: https://software.intel.com/bigdata
Trusted Analytics Platform: http://trustedanalytics.org
AP-HP: http://www.aphp.fr

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration.
No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at www.intel.com. Copies of documents which have an order number and are
referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725 or by visiting Intel’s website at http://www.intel.com/design/literature.htm.
Intel is a trademark of Intel Corporation or its subsidiaries in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2016 Intel Corporation. All rights reserved. Printed in USA 0516/MD/MCB/PDF Please Recycle 334409-001US
