
Master in Biomedical Engineering

Medical Data Intelligent Analysis


AIDM
Introduction
Gema García Sáez
The healthcare data explosion
 Adoption of EMRs in the US for all public and private healthcare providers in 2014
 Marked the beginning of an exponential increase in patient health data

https://www.cycloneinteractive.com/cyclone/assets/File/digital-universe-healthcare-vertical-report-ar.pdf
Healthcare is becoming a data problem

https://www.cycloneinteractive.com/cyclone/assets/File/digital-universe-healthcare-vertical-report-ar.pdf
The healthcare data explosion

 In 2017, it was estimated that a single patient generated approximately 80 MB of data per year
 EMR, Medical images
 In 2025:
 Annual growth rate of data for healthcare will reach 36%
 Faster than what is projected for other massive industries (financial services, media,
entertainment)
 After Covid-19 this estimate is probably too low, since data generation has surged in:
 Telehealth utilization
 Contact tracing
 Outbreak tracking
 Virus testing
 Medical research
https://www.rbccm.com/en/gib/healthcare/episode/the_healthcare_data_explosion
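To get a feel for what a 36% annual growth rate implies, the short R sketch below computes the doubling time and a five-year projection. It assumes the rate is a constant compound annual rate. The 80 MB starting value is used only as an illustrative baseline; the slides do not state that per-patient data grows at this rate.

```r
# Assumption: constant compound annual growth rate of 36%
growth_rate <- 0.36

# Years needed for the data volume to double
doubling_time <- log(2) / log(1 + growth_rate)

# Illustrative projection from a hypothetical baseline of 80 MB
baseline_mb <- 80
years <- 0:5
projection <- data.frame(
  year = years,
  volume_mb = baseline_mb * (1 + growth_rate)^years
)

print(round(doubling_time, 2))
print(projection)
```

Under these assumptions the volume doubles in a little over two years, which is why storage and analysis capacity have to be planned ahead of demand.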
The power of Healthcare Data
 Healthcare is becoming increasingly connected but also
increasingly complex
 Challenge and opportunity
 Study of trends in Healthcare:
“What has become very clear is that the
greatest force behind health trends is data”

“Whether it is health wearables or on-demand testing, better
hospital software or algorithms capable of catching disease
more effectively, rapid change is taking place because of
increased access to big data and advanced data analytics”

Stanford Medicine 2017 Health Trends Report


https://med.stanford.edu/content/dam/sm/sm-news/documents/StanfordMedicineHealthTrendsWhitePaper2017.pdf
Trends in Healthcare Data (1)
 Medical research: Access to new, diverse data and open datasets
 Drug discovery
 Clinical trials and research more efficient
 Daily life
 Wearable devices
 Online diagnostic tools
 Genetic sequencing services
 Result: better informed and engaged patients


 Health systems are investing in Machine learning
 As effective as or more effective than human diagnostics

https://med.stanford.edu/content/dam/sm/sm-news/documents/StanfordMedicineHealthTrendsWhitePaper2017.pdf
Trends in Healthcare Data (2)

 Ongoing care:
 Telemedicine and health apps
 See patients virtually for increased access and tailored care
 Prediction and prevention:
 Health data make it possible to build better patient profiles and predictive models to
more effectively anticipate, diagnose and treat disease
 Skills and training:
 Need for proper infrastructure and a data-literate clinical workforce to take
advantage of health data
Sources of personal data in medicine
 Family history
 Medical prescriptions
 Hospitalizations
 Clinical data
 Surgeries
 Diagnosis
 Healthcare costs, insurance
 Laboratory tests
 Medical images
 Genetic studies


Sources of clinical data
 Clinical practice – Electronic Medical Record
 Medical research
 Pharmacologic studies
 Registries of patients
 Public health – Registry of diseases
 eHealth and mHealth applications
 Medical devices
 Wearables
 IoT Devices
 Environmental data
 Social networks
Features of Healthcare data
 Distributed
 Different health departments, medical centers
 Different formats:
 Text, medical images, video, …
 Replicated data in different systems: inconsistencies?
 Ex. Primary care, Specialized care
 Structured vs. non-structured data
 Data generated by patients’ medical devices
 Continuous monitoring
 Monitoring at home
 Daily generation

 Complex data
 Privacy and security
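The point about replicated data can be illustrated with a minimal R sketch that compares two copies of the same patient records and flags fields that disagree. All table names, column names and values here are hypothetical. A flagged difference is only a candidate inconsistency: two systems may legitimately hold different diagnoses for the same patient.

```r
# Hypothetical records for the same patients held in two systems
primary_care <- data.frame(
  patient_id = c(1, 2, 3),
  birth_year = c(1980, 1975, 1990),
  diagnosis  = c("T2DM", "Hypertension", "Asthma"),
  stringsAsFactors = FALSE
)
specialized_care <- data.frame(
  patient_id = c(1, 2, 3),
  birth_year = c(1980, 1957, 1990),   # transposed digits for patient 2
  diagnosis  = c("T2DM", "Hypertension", "COPD"),
  stringsAsFactors = FALSE
)

# Join on the shared identifier; suffixes mark the source system
merged <- merge(primary_care, specialized_care,
                by = "patient_id", suffixes = c(".pc", ".sc"))

# Flag fields whose values disagree between the two systems
merged$birth_year_mismatch <- merged$birth_year.pc != merged$birth_year.sc
merged$diagnosis_mismatch  <- merged$diagnosis.pc  != merged$diagnosis.sc

# Keep only the patients with at least one disagreement
inconsistent <- merged[merged$birth_year_mismatch | merged$diagnosis_mismatch, ]
print(inconsistent)
```

In practice the hard part is usually the join itself, because distributed systems often lack a shared patient identifier.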
Big Data Features
 Volume
 Scale of data

 Variety
 Different forms of data
 Biological, clinical, environmental, lifestyle
 From single individuals to large cohorts

 Velocity
 Analysis of streaming data

 Veracity*
 Quality and validity
 Uncertainty of data
Laney, D. (2001). 3D Data Management: Controlling Data Volume, Velocity, and Variety
IBM. (2013). The Four V’s of Big Data.
Artificial Intelligence

Ability to make computers do things that would require
intelligence if done by humans*

 Systems associated with human intelligence and human behaviour
 Understanding of language
 Learning
 Reasoning to make decisions
 Problem solving

*Boden M. Artificial Intelligence and Natural Man: Basic Books, Inc.; 1977
Features of Artificial Intelligence

 Resolution of reasoning problems characterized by a degree of uncertainty
 Extract information from:
 Knowledge stored in data
 Past experiences
 Knowledge provided by experts

 Adjust its behavior according to the changing environment


Which are the key areas and biggest opportunities for AI in Healthcare?

AI in Healthcare White paper-European Big Data Value Forum 2020


https://www.bdva.eu/sites/default/files/AI%20in%20Healthcare%20Whitepaper_November%202020_0.pdf
Impact of Health Data on Medical Research

 Access to vast amounts of data that did not exist 5 or 10 years ago
 Recent open data initiatives have increased the
types of publicly-accessible data
 Researchers can pose new questions, uncover new
findings and validate hypotheses
 Advancements in genomics and gene-sequencing
 Creation of large volumes of diverse datasets for
drug discovery
 Accelerate development of new products
Project Baseline
 The project is tracking 10,000 voluntary participants over 4 years
(2017-2020)
 Build a map of human health and disease
 Effort to gain a broader understanding of health
Project Baseline

 “Data starts to become valuable and interesting when we can bring it together to
get a holistic look at human health”
 “Genetic risk alone is not necessarily a predictor of developing a condition. When
viewed side-by-side with other risk factors, genetics can enable precision
medicine: a personalized care plan tailored to each person’s unique medical
history”

https://blog.projectbaseline.com/?_ga=2.73686935.489509056.1573224209-6386543.1573224209
All of Us Research Program - NIH

 Gather data from 1 million or more people living in the United States
 Aim: accelerate research and improve health
 By taking into account individual differences in lifestyle, environment, and biology,
researchers will uncover paths toward delivering precision medicine
Real World Evidence
Joint Action Towards the European Health Data
Space – TEHDAS
Joint Action: Supports the Member States and the Commission; develops and promotes concepts for sharing of data in secondary use
Participants: Nominated authorities from 25 European countries
Duration: 30 months, from February 2021 to Summer 2023
Co-funding: €4 million (EU 60%, Member States 40%)



Joint Action Towards the European Health Data Space

Secondary use of health data:
 Governance: Options for governance models for data-sharing at EU level, in particular networking for secondary use
 Sustainability: The needs, expectations and views of stakeholders on economic sustainability
 Data altruism: Definitions, good practices and use cases of GDPR-compliant data altruism
 Data quality: Availability of comparable high-quality health data for research and innovation
 Citizen engagement: Citizen perception of health data and data-sharing practices
 Infrastructure: Shared European data infrastructures across the EU (cf. eHDSI)
European Strategy for Data (2020)
A common European data space, a single market for data

 Data can flow within the EU and across sectors
 Availability of high-quality data to create and innovate
 European rules and values are fully respected
 Rules for access and use of data are fair, practical and clear, and clear data governance mechanisms are in place
EHDS proposal
Chapter 1: General provisions (Art. 1-2)
Chapter 2: Primary use of electronic health data (Art. 3-13)
Chapter 3: EHR systems and wellness applications (Art. 14-32)
Chapter 4: Secondary use of electronic health data (Art. 33-58)
Chapter 5: Additional actions (Art. 59-63)
Chapter 6: European governance and coordination (Art. 64-66)
Chapters 7-9: Delegation, Miscellaneous, Deferred application (Art. 67-72)
https://health.ec.europa.eu/ehealth-digital-health-and-care/european-health-data-space_en
EHDEN: Vision and Mission

Vision
Values
The European Health Data & Evidence Network (EHDEN) aspires to be
the trusted observational research ecosystem to enable better health
decisions, outcomes and care

Mission
Our mission is to provide a new paradigm for the discovery and
analysis of health data in Europe, by building a large-scale, federated
network of data sources standardised to a common data model
OVERVIEW DATA PARTNER CALLS
Applications (n=405)
Awarded applications (n=166)

Countries shown in the bar chart of data partners per country (axis: 0 to 24 data partners): Spain, Italy, UK, Belgium, Portugal, Germany, The Netherlands, France, Finland, Croatia, Serbia, Hungary, Greece, Switzerland, Czech Republic, Turkey, Sweden, Norway, Israel, Bulgaria, Montenegro, Luxembourg, Ireland, Estonia, Denmark, Bosnia Herzegovina, Austria

Figure: Geographic spread of data partners. The shade of blue indicates the number of data partners in that country (darker = more).
Covid-19 open source datasets

COVID-19 open source data sets: a comprehensive survey. Applied intelligence 2020. https://link.springer.com/article/10.1007/s10489-020-01862-6
BigMedilytics
 Uses Big Data and AI technologies to improve the productivity of the Healthcare
sector
 12 pilots covering all major disease groups in the EU (prevention and diagnosis)

AI in Healthcare White paper-European Big Data Value Forum 2020


https://www.bdva.eu/sites/default/files/AI%20in%20Healthcare%20Whitepaper_November%202020_0.pdf
Artificial intelligence in the battle against covid-19
 The crisis might have slowed non-critical
investments in collecting data and
developing AI tools
 The development of AI solutions for chest
radiology has accelerated
 High adoption of technologies that support
remote care, such as AI-based technologies
 Chatbots
 Remote diagnosis
 Automatic risk assessment

*Thanh Thi Nguyen. Artificial intelligence in the battle against coronavirus (covid-19): a survey and future research directions.
Conclusions
 The Healthcare sector is changing from a volume-based to a value-based
model
 It is essential to get a complete and accurate understanding of
treatment trajectories in specific patient populations
 Aggregate disparate data sources (granularity, quality, type of data)
 Allow better quality of care
 Allow access to healthcare at lower costs
 Provide first-time-right treatment and benefit the patient

Precision Health
Healthcare that is more preventive, predictive,
personalized and precise
Software Tools
Information about practical sessions in:
- Advanced visualization of Health Data
- Big Data in Health
Software Environment

 R Language & RStudio


 Notebooks
 Shiny
The “R” Language

R is a popular statistical programming language with a
number of extensions that support data processing and
machine learning tasks
Preferred language for statisticians, used by many data scientists
Very popular in biomedical research, bioinformatics,
mathematics, etc.

Features:
The most comprehensive collection of statistical models and
distributions
Allows interactive exploration and visualization of data
CRAN: Comprehensive R Archive Network
Large resource of open source statistical models
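As a rough illustration of these features, the sketch below uses only base R and the built-in `ToothGrowth` dataset to compute descriptive statistics, fit a simple linear model and draw an exploratory plot. The choice of dataset and model is an assumption made for the example, not part of the course material.

```r
# ToothGrowth ships with R (datasets package): tooth length in
# guinea pigs by vitamin C dose and delivery method
data(ToothGrowth)

# Descriptive statistics of the outcome
print(summary(ToothGrowth$len))

# Mean tooth length per dose level
print(aggregate(len ~ dose, data = ToothGrowth, FUN = mean))

# Simple linear model: tooth length as a function of dose
fit <- lm(len ~ dose, data = ToothGrowth)
print(coef(fit))

# Quick exploratory plot, drawn only in an interactive session
if (interactive()) {
  boxplot(len ~ dose, data = ToothGrowth,
          xlab = "Dose (mg/day)", ylab = "Tooth length")
}
```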
Software installation
1. Install R
 Open an internet browser and go to www.r-project.org
 Click the "download R" link in the middle of the page under "Getting Started"
 Select a CRAN location (a mirror site) and click the corresponding link (e.g.
RedIris in Spain: https://cran.rediris.es/)
 Click on the "Download R for …" link at the top of the page depending on your
Operating System
 Windows: Click on the "install R for the first time" link at the top of the page:
Download R 4.2.2 for Windows
 macOS, depending on your processor: R-4.2.2.pkg (Intel Macs) or R-4.2.2-arm64.pkg (Apple silicon Macs)

2. Install Software environment: RStudio


 Go to www.rstudio.com and click on the "Download" button
 Click on "Download RStudio Desktop" (FREE)
 Click on the version recommended for your system and install it
Software installation
3. Install libraries (depending on your needs)
 Install any package with:
install.packages("library_name")

 Recommended libraries to install:
 install.packages("dplyr") # data manipulation
 install.packages("ggplot2") # visualization
 install.packages("rmarkdown") # Notebooks
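Re-running `install.packages()` for packages that are already present is harmless but slow. A minimal sketch of one way to install only what is missing is shown below; the helper name `missing_packages` is hypothetical, and the install step is limited to interactive sessions so the script does not download anything when run unattended.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages recommended for the practical sessions
recommended <- c("dplyr", "ggplot2", "rmarkdown")
to_install <- missing_packages(recommended)

if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}

# Load a package once it is installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
}
```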
Cheat sheets
 Basic instructions to use the main libraries
RStudio Cheat sheets
 https://rstudio.com/resources/cheatsheets/
RStudio IDE
 https://raw.githubusercontent.com/rstudio/cheatsheets/main/rstudio-ide.pdf
Data Transformation with library dplyr
 https://raw.githubusercontent.com/rstudio/cheatsheets/main/data-transformation.pdf
Data Visualization
 https://raw.githubusercontent.com/rstudio/cheatsheets/main/data-visualization.pdf
Shiny
 https://shiny.rstudio.com/images/shiny-cheatsheet.pdf
Example cheat sheet RStudio
Tools to report Data Science
Notebooks
Notebooks to report Data Science results
 Interactive environment to work and share your code with others

 Main Notebooks with R:
 R Markdown: popular option in the R community to report on data analyses
 Jupyter: uses a kernel prepared to work with R in the notebook environment
Markdown language
Markdown cells use a set of conventions for formatting
plain text
Bold and italic text:
 Surround italicized text with asterisks, like *italics*
 Surround bold text with two asterisks, like **bold**
Lists:
 Group lines into bullet points that begin with an asterisk followed by a space
* 1)
* 2)
* 3)
Markdown language
 Headers (e.g., section titles)
 Place one or more hash signs (#) at the start of a line that will be a
header (or sub-header)
#
##
###
….
 Hyperlinks
Put the link text in square brackets, followed by the URL in parentheses
[Github](https://www.github.com)
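The Markdown conventions above can be combined in a single small file. The R sketch below writes such a file to a temporary location and prints it back; the file content is illustrative only.

```r
# Markdown source using headers, emphasis, a bullet list and a hyperlink
md_lines <- c(
  "# Section title",
  "## Sub-section title",
  "",
  "This is *italic* and this is **bold**.",
  "",
  "* First item",
  "* Second item",
  "* Third item",
  "",
  "[Github](https://www.github.com)"
)

# Write the lines to a temporary .md file and show its content
md_file <- tempfile(fileext = ".md")
writeLines(md_lines, md_file)
cat(readLines(md_file), sep = "\n")
```

Opening the resulting file in RStudio and using the preview shows how each convention is rendered.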
R Markdown

 Workflow to create Rmd files


 See examples of Markdown code

https://www.markdownguide.org/cheat-sheet/
R Markdown

 When you click **Knit**, a document is generated with the content of
the notebook
 Code
 Results
 Comments
 `echo = TRUE` -> print the R code in the document
 `include = TRUE` -> include the chunk (code and output) in the document; with `include = FALSE` the code still runs but nothing is shown

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
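To see how the setup chunk fits into a complete notebook, the R sketch below assembles a minimal `.Rmd` file and renders it, which is roughly what the Knit button does. The file content and title are hypothetical. Rendering is attempted only if the `rmarkdown` package and pandoc are available on the machine.

```r
# Three backticks, built here so the chunk fences do not clash with this listing
fence <- strrep("`", 3)

# Minimal R Markdown document: YAML header, setup chunk, one visible chunk
rmd_lines <- c(
  "---",
  "title: \"Example notebook\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r setup, include=FALSE}"),
  "knitr::opts_chunk$set(echo = TRUE)",
  fence,
  "",
  "## Summary of a built-in dataset",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)

rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_file)

# Render only if rmarkdown and pandoc are available on this machine
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(out_file)
}
```

Because the setup chunk uses `include=FALSE`, it runs but leaves no trace in the output, while the second chunk shows both its code and its result.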
