Purpose of review
To summarize how big data and artificial intelligence technologies have evolved, their current state, and next steps
to enable future generations of artificial intelligence for ophthalmology.
Recent findings
Big data in health care is ever increasing in volume and variety, enabled by the widespread adoption of electronic
health records (EHRs) and standards for health data information exchange, such as Digital Imaging and
Communications in Medicine and Fast Healthcare Interoperability Resources. Simultaneously, the development of
powerful cloud-based storage and computing architectures supports a fertile environment for big data and
artificial intelligence in health care. The high volume and velocity of imaging and structured data in ophthalmology
by BhDMf5ePHKbH4TTImqenVCscuGFl+NVZjo9UKqT7VeZDvydHgC8yFEiNei3xAdoW
and is one of the reasons why ophthalmology is at the forefront of artificial intelligence research. Still needed are
consensus labeling conventions for performing supervised learning on big data, promotion of data sharing and
reuse, standards for sharing artificial intelligence model architectures, and access to artificial intelligence models
through open application program interfaces (APIs).
Summary
Future requirements for big data and artificial intelligence include fostering reproducible science, continuing open
innovation, and supporting the clinical use of artificial intelligence by promoting standards for data labels, data
sharing, artificial intelligence model architecture sharing, and accessible code and APIs.
Keywords
artificial intelligence, big data, deep learning, informatics, machine learning
‘Big Data’ is a term often informally used in health care to describe any dataset bigger than a few thousand data points. However, the definition of big data is not guided by sample size cutoffs; rather, the scale and the challenge of ‘big data’ in the modern age is that datasets are now so large that storage or processing of the data strains the computational capacity of a single computer and requires more specialized computing solutions. In health care, big datasets may have many rows comprising the records of millions of patients, and/or have many columns corresponding to the multiple features captured for each patient. Furthermore, besides the sheer volume of data, big data is often characterized by velocity (speed of acquisition) and variety, including multimodal data: a combination of images, text, and structured fields [1].

The volume, velocity, and variety of big data in health care, combined with the recent leaps in data storage and processing technologies, have created a fertile environment for the development of artificial intelligence models to perform a wide variety of classification and prediction tasks. Ophthalmology in particular has many quantitative structured data fields, such as visual acuity and intraocular pressure, meshed with structural imaging and functional measurements,
a new rich environment for big data and artificial intelligence.

Standards for health data exchange define formats and fields for the storage of health data from different health systems and sources, thereby enabling the syntactic interoperability required for machines to read data from different sources using common conventions. Digital Imaging and Communications in Medicine (DICOM) and Fast Healthcare Interoperability Resources (FHIR) are two such standards.
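To make the idea of syntactic interoperability concrete, the sketch below parses a minimal FHIR-style Observation resource using only the standard library. The nesting (`code.coding`, `valueQuantity`) follows the FHIR Observation layout, but the specific values and the helper `extract_quantity` are illustrative assumptions, not part of the article or of any particular EHR export.

```python
import json

# A minimal FHIR-style Observation resource; the values are invented.
observation_json = """
{
  "resourceType": "Observation",
  "status": "final",
  "code": {"coding": [{"system": "http://loinc.org",
                       "display": "Intraocular pressure"}]},
  "valueQuantity": {"value": 17, "unit": "mm[Hg]"}
}
"""

def extract_quantity(resource: dict) -> tuple:
    """Return (display name, value, unit) from an Observation resource."""
    display = resource["code"]["coding"][0]["display"]
    quantity = resource["valueQuantity"]
    return display, quantity["value"], quantity["unit"]

obs = json.loads(observation_json)
print(extract_quantity(obs))  # ('Intraocular pressure', 17, 'mm[Hg]')
```

Because every conformant source nests the measurement the same way, the same few lines of parsing code work regardless of which health system produced the record; this is the practical benefit of syntactic interoperability.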
To advance big data and artificial intelligence in ophthalmology, there are several critical areas of need going forward. An important minimum standard for scientific research is producing reproducible results. In training artificial intelligence models on big data, reproducibility is critically reliant on the following: how the data used to train the artificial intelligence model is labeled, the underlying structure and characteristics of the data, and the details of the model architecture. Each of these components is a vital area to consider when developing and evaluating artificial intelligence models using big data.
Common labels

Supervised artificial intelligence models are trained for prediction or classification tasks based on data that are labeled with the ‘ground truth’ or ‘gold standard’ classification; thus, there is a critical need for ophthalmologists to develop standardized definitions and classification approaches. For example, when diagnosing glaucoma or determining glaucoma-related outcomes, many glaucoma specialists rely upon visual field testing, disc morphology, or optical coherence tomography (OCT) findings to varying degrees. Therefore, for research studies, adjudication of glaucoma outcomes is often carried out by consensus of a panel of glaucoma specialists, with work continually ongoing to develop standardized research definitions [16]. Similarly, for grading retinal images, some artificial intelligence studies also use a panel of graders [17], as in clinical trials. However, utilizing a consensus or panel to grade one particular dataset does not harmonize potentially idiosyncratic definitions across different datasets.
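A panel's grades are commonly reduced to a single training label by majority vote, with ties referred for expert adjudication. The sketch below illustrates that generic idea; it is not the adjudication protocol of any study cited here, and the function name and labels are invented.

```python
from collections import Counter

def consensus_label(grades):
    """Majority-vote label from a panel of graders; None signals a tie
    that should be referred for expert adjudication."""
    counts = Counter(grades).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: no consensus
    return counts[0][0]

# Three graders assess the same fundus photo.
print(consensus_label(["referable", "referable", "non-referable"]))  # referable
print(consensus_label(["referable", "non-referable"]))               # None (tie)
```

Even with a mechanical rule like this, the resulting labels are only as consistent as the grading definitions given to the panel, which is why the standardization discussed above matters.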
While reference standards for diseases across different studies would be ideal, they are challenging to develop and deploy in any research setting, and especially challenging in big data for artificial intelligence. Ideally, reference standards are based on clinical outcomes, which can be challenging in the case of chronic diseases [18]. Applying strict and complex grading criteria that require the consensus of ophthalmologic specialists is very expensive in time, effort, and dollars, even for the relatively small scale of clinical trials, and often infeasible on the scale of modern big data. Complex diagnostic criteria often require many disparate data components, such as history, examination, and imaging elements. Observational data sources inevitably include patients who are missing data for some of these elements, and those missing data are unlikely to be missing at random. For example, patients may be missing OCTs, visual fields, or lab measurements in a pattern which is systematically related to their suspected disease [19]. EHRs also introduce unique challenges, since data fields may differ across EHR systems, both for data input and storage, and providers may use different terminology even within a single EHR system (e.g., corneal superficial punctate keratitis vs. corneal punctate epithelial erosions to represent the same clinical finding). These variations often necessitate custom disease definitions for each study based on the availability and content of the data fields. Thus, there is also a need to standardize how definitions are developed to label big data and the criteria for judging whether the definitions are acceptable for big data that are used to train artificial intelligence algorithms. These criteria are likely to differ from those used to judge clinical trials because they must balance feasibility with quality and consensus. Successful scaling of big data artificial intelligence will require leadership to standardize, where possible, the consensus definitions for disease diagnoses so they can be used to generate high-quality ‘ground truth’ labels, and to standardize how consensus definitions are developed for big data studies.

To sidestep the issue of using complex ‘ground truth’ disease definitions as labels, some studies use different types of proxy or surrogate outcomes which are more readily available in the dataset. Examples include using one type of imaging modality to predict the parameters of another modality, as when predicting various retinal nerve fiber layer OCT parameters from fundus photos [20,21] or using OCT to predict other types of imaging outcomes [22,23]; or predicting future results from past results, as when predicting future visual fields from prior visual fields or other biomarkers validated against true outcomes [24,25,26]. Prediction labels may also be a proxy based on cost or utilization, as when predicting high-cost or high-utilization patients, which is convenient when many sources of big data in health care include only such administrative data and not clinical outcomes. However, caution must be exercised before deploying such algorithms for decision support, as there may be underlying societal biases that affect healthcare access, utilization, and cost [27]. This speaks to the larger concern that predicting health outcomes requires a careful understanding of those outcomes in the context of the system being studied and recognition that predicting health outcomes in one system may not generalize to other systems.

Common data

Data sharing is an important mechanism to ensure that, after the expensive process of data curation and producing high-quality labels is completed, results from artificial intelligence models can be reliably reproduced and the data can be used by other researchers, thereby having the widest possible impact by vastly multiplying the number of studies that can be performed. The FAIR guiding principles for data stewardship and management provide a framework for ensuring that data are Findable, Accessible, Interoperable, and Reusable [28].
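One lightweight way to work toward the Reusable principle is to attach machine-readable provenance metadata to each shared field. The record and the minimum-keys check below are a sketch only; the schema (field names, required keys) is invented for illustration and is not a published standard.

```python
# Illustrative provenance record for one shared data field; the schema
# is invented for this sketch, not drawn from any standard.
iop_provenance = {
    "field": "intraocular_pressure_mmHg",
    "source_system": "EHR flowsheet",
    "measurement_methods": ["Goldmann applanation", "tonopen"],
    "collection_period": "2015-2020",
    "cleaning_steps": ["dropped values outside 0-80 mmHg",
                       "averaged repeat same-day measurements"],
}

REQUIRED_KEYS = {"field", "source_system", "measurement_methods", "cleaning_steps"}

def is_documented(record: dict) -> bool:
    """Check that a shared field carries minimum provenance metadata."""
    return REQUIRED_KEYS.issubset(record)

print(is_documented(iop_provenance))  # True
```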
Although there has been progress in the development of data interoperability standards, many challenges remain in the actual process of data sharing and data reuse. Big data today is often observational, raw, and uncurated or minimally curated. Using data exchange standards does not guarantee semantic interoperability: data from different sources may be collected and recorded in widely varying ways. For example, intraocular pressure may be collected by a wide variety of methods, by different types of providers, and under different conditions, all of which may affect the recorded data. Visual acuity is even more widely variable, with the possibility to measure using different methods, under different lighting conditions, with different types of correction, and with varying notation (‘20/400’ vs. ‘400’). For data to be reusable, methods for data cleaning, harmonization, and validation should be standardized and fully transparent. Careful annotation with metadata describing the provenance of different data fields remains an important component of data sharing. A reasonable balance must be struck between validating the underlying data, which is expected to contain some errors, and manually validating the entire dataset, which is usually infeasible and would defeat the advantages of data analysis at scale.

Finally, data sharing is subject to privacy concerns. While imaging may be more easily deidentifiable and shareable, EHR data are relatively difficult to deidentify [29] and suffer from the risk of possible reidentification [30]. Furthermore, storing data in HIPAA-compliant environments requires thoughtful management, including safeguards against data unmasking as investigators potentially use different platforms for data joining or extract and snapshot data from one system to another. Despite the challenges in data sharing, it is absolutely crucial to support and incentivize mechanisms for secure data sharing to enable continued innovation in big data and artificial intelligence for ophthalmology in the academic sector. If the largest and highest quality datasets become proprietary or exclusive, it will become exceedingly challenging to support continued innovation outside of those groups. In an ideal world, all those with access to clinical data could become data stewards, ensuring the safety of the data and the use of the data for the public good, enabling the development of knowledge and tools that benefit future patients as widely as possible [31].
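As one concrete illustration of the harmonization problem described in this section, visual acuity recorded as ‘20/400’ in one system and as a bare denominator ‘400’ in another can be normalized before analysis. The cleaner below is deliberately simplified and purely illustrative; real charts also contain entries such as ‘CF’ or ‘HM’ that would need separate handling.

```python
def normalize_va(raw: str) -> str:
    """Normalize Snellen visual-acuity notation to a '20/x' string.

    Handles the two notations discussed above: '20/400' passes through,
    while a bare denominator like '400' becomes '20/400'.
    """
    value = raw.strip()
    if "/" in value:
        numerator, denominator = value.split("/", 1)
        return f"{int(numerator)}/{int(denominator)}"
    return f"20/{int(value)}"

print(normalize_va("20/400"))  # 20/400
print(normalize_va(" 400 "))   # 20/400
```

Making such cleaning rules explicit and shareable, rather than burying them in one group's analysis scripts, is exactly what standardized and fully transparent harmonization methods require.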
Common code and application programming interfaces available (such as FHIR or DICOM) in specifying the
When data cannot be shared due to privacy concerns, the code that specifies the artificial intelligence models and the innumerable underlying data preprocessing decisions should be made public to best enable reproducible research. There are several different levels of sharing of artificial intelligence models, enabling different degrees of reproduction. A bare minimum requires sharing the model architecture as code, without the trained weights or parameters. The next level includes sharing the model architecture and the training code itself, so that other researchers can see how the data are split into batches, initialization of terms, and rates of learning, thus enabling other researchers to more closely mimic the original research. The highest level of sharing, short of including the full original data, would include the model architecture, training code, and the trained weights with inference code, which would allow other researchers to plug in their own data to make predictions based on the original artificial intelligence model. While standards for how to share artificial intelligence models are nascent, Open Neural Network Exchange, a community project that specifies an open format to represent machine-learning and deep-learning models [33], provides one approach to model sharing. Beyond enabling reproducible research, code that is made publicly available can have additional uses in contributing to research endeavors beyond its original purpose, becoming a resource for further collaborative innovation.
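The highest level of sharing described above (architecture plus trained weights plus inference code) can be illustrated with a toy example: a logistic-regression "model" shipped as an open weight file together with self-contained inference code. JSON stands in here for the role an open format such as ONNX plays for deep-learning models; the weights, feature names, and function are invented for this sketch.

```python
import json
import math

# Shared artifact 1: trained weights serialized to an open format
# (JSON here, in the role ONNX plays for deep models). Values are invented.
weights_json = json.dumps({"bias": -1.0, "coef": {"age": 0.02, "iop_mmHg": 0.08}})

# Shared artifact 2: inference code that consumes the weights, so other
# groups can run predictions on their own data without the original dataset.
def predict_risk(weights_file: str, features: dict) -> float:
    w = json.loads(weights_file)
    z = w["bias"] + sum(w["coef"][k] * features[k] for k in w["coef"])
    return 1.0 / (1.0 + math.exp(-z))  # logistic link

risk = predict_risk(weights_json, {"age": 65, "iop_mmHg": 25})
print(round(risk, 3))  # 0.909
```

Because both the weights and the inference logic travel together in open formats, a second site can validate the model on its own patients without ever seeing the training data.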
Many complex artificial intelligence models also depend upon specific underlying hardware configurations to execute. Standards for application program interfaces (APIs) to access artificial intelligence models would be enormously helpful in enabling researchers at different sites to access common artificial intelligence models while retaining full control over their own data. Standardization of artificial intelligence APIs would enable validation of artificial intelligence models across different sites and allow the training of better models based on more diverse data. Realizing the promise of artificial intelligence models also requires addressing the challenges of implementing and integrating the use of artificial intelligence models into clinical workflow. Here, too, standards for interfacing with APIs are critically needed in order for locally generated data to be able to interface with artificial intelligence models. APIs for access to artificial intelligence models could and should take advantage of the standardized data formats now available (such as FHIR or DICOM) in specifying the format of data inputs for the widest possible use. For example, the Substitutable Medical Applications, Reusable Technologies (SMART) on FHIR API is an open, free, and standards-based API that aims to enable the creation of an ecosystem of apps built on access to EHR data [34,35]. While most SMART on FHIR apps are not specifically in the artificial intelligence space, artificial intelligence-based applications can use a similar approach for access to their artificial intelligence models. As with data sharing, the challenge in model sharing is to continue to support academic innovation by promoting openness, interoperability, and standardized APIs, thereby accruing the widest possible benefits of artificial intelligence and preventing artificial intelligence models from becoming ‘locked down’ into separate, inaccessible silos, as we have seen with EHRs. In ophthalmology in particular, efficient real-time inference workflows are important, as the time between image acquisition and the need for artificial intelligence inference results is comparatively short compared with other fields of medicine, such as radiology or pathology, where interpretation traditionally happens asynchronously.

CONCLUSION
The environment for big data and artificial intelligence is evolving rapidly, supported by the adoption of EHRs, standards for health data exchange, and modern technologies for storing and processing large datasets. Future requirements for big data and artificial intelligence include fostering reproducible science, continuing open innovation, and supporting the clinical use of artificial intelligence. Researchers, academic institutions, healthcare companies, and regulatory agencies should work together to achieve these goals by promoting data label standards, data sharing, standards for sharing artificial intelligence model architecture, and accessible code and APIs.

Acknowledgements
The authors would like to thank Dr Marian Blazes (University of Washington) for her help in editing this article.

Financial support and sponsorship
NIH/NEI K23EY029246, NIH P30EY10572; an unrestricted grant from Research to Prevent Blindness; T15 LM 007033 (S.Y.W.). The sponsors/funding organizations had no role in the design or conduct of this research.

Conflicts of interest
A.Y.L. is an employee of the US FDA, has received research funding from Novartis, Santen, and Carl Zeiss Meditec, and has acted as a consultant for Genentech, Topcon, and Verana Health. The opinions expressed in this article are the author's own and do not reflect the view of the FDA or the US government.

AAO Medical Information Technology Committee Members: A.Y.L., MD, MSCI (Co-Chair), Department of Ophthalmology, University of Washington, Seattle, Washington; Thomas S. Hwang, MD (Co-Chair), Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon; James D. Brandt, MD, Department of Ophthalmology & Vision Science, University of California, Davis; Kelly Chung, MD, Oregon Eye Specialists, Portland, Oregon; Nikolas London, MD, Retina Consultants San Diego, Poway, California; April Maa, MD, Department of Veteran Affairs, VISN 7 Regional Telehealth Services, Atlanta, Georgia; Department of Ophthalmology, Emory University School of Medicine, Atlanta, Georgia; S.P., MD, MS, Department of Ophthalmology, Byers Eye Institute, Stanford University; Jessica Peterson, MD, MPH, American Academy of Ophthalmology, San Francisco, California. A.Y.L., US FDA (E), Genentech (C), Topcon (C), Verana Health (C), Santen (F), Novartis (F), Carl Zeiss Meditec (F). April Maa, Click Therapeutics (C), Warby Parker (C). S.P., Verana Health (C), Acumen LLC (C).

AAO Task Force on Artificial Intelligence Members: Michael F. Chiang, MD (Chair), Departments of Ophthalmology and Medical Informatics & Clinical Epidemiology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon; Michael D. Abràmoff, MD, PhD, Retina Service, Departments of Ophthalmology and Visual Sciences, Electrical and Computer Engineering, and Biomedical Engineering, University of Iowa, Iowa City, Iowa; IDx, Coralville, Iowa; J. Peter Campbell, MD, MPH, Department of Ophthalmology, Oregon Health & Science University, Portland, Oregon; Pearse A. Keane, MD, FRCOphth, Institute of Ophthalmology, University College London, London; Medical Retina Service, Moorfields Eye Hospital NHS Foundation Trust, London; A.Y.L., MD, MSCI, Department of Ophthalmology, University of Washington, Seattle, Washington; Flora C. Lum, MD, American Academy of Ophthalmology, San Francisco, California; Louis R. Pasquale, MD, Eye and Vision Research Institute, Icahn School of Medicine at Mount Sinai, New York, New York; Michael X. Repka, MD, MBA, Wilmer Eye Institute, Johns Hopkins University, Baltimore, Maryland; Rishi P. Singh, MD, Cole Eye Institute, Cleveland Clinic, Cleveland, Ohio; Daniel Ting, MD, PhD, Singapore National Eye Center, Duke-NUS Medical School, Singapore. Michael D. Abràmoff, IDx (I,F,E,P,S), Alimera (F). J. Peter Campbell, Genentech (F). A.Y.L., US FDA (E), Genentech (C), Topcon (C), Verana Health (C), Santen (F), Novartis (F), Carl Zeiss Meditec (F). Louis R. Pasquale, Verily (C), Eyenovia (C), Nicox (C), Bausch + Lomb (C), and Emerald Bioscience (C). Pearse A. Keane, DeepMind Technologies (C), Roche (C), Novartis (C), Apellis (C), Bayer (F), Allergan (F), Topcon (F), Heidelberg Engineering (F). Daniel Ting, EyRIS (IP), Novartis (C), Ocutrx (I, C), Optomed (C). Rishi Singh, Alcon (C), Genentech (C), Novartis (C), Apellis (F), Bayer (C), Carl Zeiss Meditec (C), Aerie (F), Graybug (F), Regeneron (C).

1040-8738 Copyright 2020 The Author(s). Published by Wolters Kluwer Health, Inc. www.co-ophthalmology.com