
INTERNATIONAL TELECOMMUNICATION UNION

FG-AI4H-I-036
TELECOMMUNICATION
STANDARDIZATION SECTOR ITU-T Focus Group on AI for Health
STUDY PERIOD 2017-2020 Original: English
WG(s): Plenary E-meeting, 7-8 May 2020
DOCUMENT
Source: Editors
Title: Updated DEL2.2: Guidelines for AI based medical device: Regulatory
requirements (Draft: April 2020)
Purpose: Discussion
Contact: Luis Oala Email: luis.oala@hhi.fraunhofer.de
HHI Fraunhofer
Germany
Contact: Christian Johner Email: christian.johner@johner-institut.de
Johner Institute for Healthcare IT
Germany
Contact: Pradeep Balachandran Email: pbn.tvm@gmail.com
Technical Consultant (eHealth)
India

Abstract: This document contains the latest draft of the FG-AI4H deliverable DEL2.2
"Guidelines for AI based medical device: Regulatory requirements". This
deliverable defines a set of guidelines intended to serve AI solution
developers/manufacturers on how to conduct a comprehensive requirements
analysis and to streamline the conformity assessment procedures to ensure
regulatory compliance for AI based medical devices (AI-MD).

International Telecommunication Union

ITU-T FG-AI4H
Deliverable
TELECOMMUNICATION
STANDARDIZATION SECTOR
OF ITU
(draft 2020-04-14)

FG AI4H DEL2.2
Guidelines for AI Based Medical Devices –
Regulatory Requirements
Summary
This technical paper defines a set of guidelines intended to serve AI solution
developers/manufacturers on how to conduct a comprehensive requirements analysis and to
streamline the conformity assessment procedures to ensure regulatory compliance for AI based
medical devices (AI-MD). This set of guidelines gives top priority to patient safety and focuses on
a streamlined process for risk minimization and quality assurance for AI based health solutions.
The proposed set of guidelines adopts, extends and leverages the best practices and
recommendations provided by international medical device regulatory agencies such as the IMDRF
and the FDA. These guidelines do not include any legally binding or statutory requirements
applicable to any specific regulatory framework or specific geographic jurisdiction.
Keywords
[Regulatory Checklist, Software-as-a-Medical Device, AI based Medical Devices]
Change Log
This document contains Version 1.0 of the FG-AI4H deliverable on "Guidelines for AI based
Medical Devices – Regulatory Requirements" approved at the ITU-T Focus Group on Artificial
Intelligence for Health meeting held in Geneva, dd-dd MMMM YYYY.

Editors: Luis Oala Email: luis.oala@hhi.fraunhofer.de
HHI Fraunhofer, Germany
Christian Johner Email: christian.johner@johner-institut.de
Johner Institute for Healthcare IT, Germany
Pradeep Balachandran Email: pbn.tvm@gmail.com
Technical Consultant (eHealth), India

Contributors: (in alphabetical order)

Alixandro Werneck Leite Email: alixandrowerneck@outlook.com
Machine Learning Laboratory of Finance and Organizations – LAMFO, Brazil

Andrew Murchison Email: agmurchison@gmail.com
Oxford University Hospitals NHS Foundation Trust, United Kingdom

Peter G. Goldschmidt Email: pgg@has.com
WORLD DEVELOPMENT GROUP, Inc., USA

Shan Xu Email: xushan@caict.ac.cn
China Academy of Information and Communications Technology (CAICT), China

Sven Piechottka Email: sven.piechottka@merantix.com
Merantix Healthcare, Germany

CONTENTS
Page

1 Scope.......................................................................................................................................10

2 References...............................................................................................................................10

3 Terms and definitions..............................................................................................................11

3.1 Terms defined elsewhere..........................................................................................11
3.2 Terms defined here....................................................................................................12

4 Abbreviations and acronyms...................................................................................................12

5 Regulatory guidelines formulation: baseline references.........................................................12

6 Proposed roadmap for regulatory requirements guidelines.....................................................13

6.1 Guidelines: applicability scope.................................................................................13
6.1.1 Scope of regulation.......................................................................................13
6.1.2 Scope of product...........................................................................................13
6.1.3 Scope of application.....................................................................................13
6.2 Guidelines: aim & objectives....................................................................................13
6.3 Guidelines: target user class / roles...........................................................................14
6.4 Requirements assessment model: a product lifecycle development based approach.............14

7 Regulatory guidelines: proposed structure..............................................................................17

8 Regulatory guidelines: requirements checklist.......................................................................19

8.1 Pre-market requirements...........................................................................................20
8.1.1 Intended use and stakeholder requirements..................................................22
8.1.2 Product and software requirements..............................................................29
8.1.3 Data management requirements...................................................................34
8.1.4 Model development requirements................................................................41
8.1.5 Product development requirements..............................................................46
8.1.6 Product validation requirements...................................................................52
8.1.7 Product release requirements........................................................................55
8.2 Post-market requirements.........................................................................................56
8.2.1 Production, distribution & installation requirements....................................56
8.2.2 Post-market surveillance requirements.........................................................58

Annex A: IMDRF Essential Principles..............................................................................................63

Annex B: IMDRF SaMD Risk Categorization Framework...............................................................70

B.1 Criteria for determining the SaMD category.............................................................70
B.2 Levels of autonomy...................................................................................................71

Annex C: Johner regulatory guidelines for AI for medical devices...................................................72

Annex D: FG-AI4H data and AI solution quality assessment criteria...............................................73

Annex E: ITU ML5G high-level requirements mapping to AI for health requirements....................80

Annex F: DIN SPEC 92001-AI devices life cycle processes requirements.......................................83

Annex G: AI4H Project Deliverable Reference.................................................................................92

List of Tables
Page
Table 1: Process requirements............................................................................................................20
Table 2: Competencies requirements.................................................................................................21
Table 3: Intended use requirements....................................................................................................22
Table 4: Intended users and intended context of use requirements....................................................24
Table 5: Stakeholder requirements.....................................................................................................25
Table 6: Inputs to risk management and clinical evaluation requirements........................................27
Table 7: Functionality and performance requirements.......................................................................29
Table 8: User interface requirements..................................................................................................31
Table 9: Additional software requirements........................................................................................31
Table 10: Risk management & clinical evaluation requirements.......................................................32
Table 11: Data collection requirements..............................................................................................34
Table 12: Data annotation requirements.............................................................................................37
Table 13: Data pre-processing requirements......................................................................................38
Table 14: Documentation and version control requirements..............................................................40
Table 15: Model preparation requirements........................................................................................41
Table 16: Model training requirements..............................................................................................42
Table 17: Model evaluation requirements..........................................................................................43
Table 18: Model documentation requirements...................................................................................45
Table 19: Software development requirements..................................................................................47
Table 20: Accompanying materials requirements..............................................................................49
Table 21: Usability validation requirements......................................................................................52
Table 22: Clinical evaluation requirements........................................................................................53
Table 23: Product release requirements..............................................................................................55
Table 24: Production, distribution and installation requirements.......................................................56
Table 25: Post-market surveillance requirements...............................................................................58
Table A.1: IMDRF EP 5.1 – General.................................................................................................64
Table A.2: IMDRF EP 5.2 – Clinical evaluation................................................................................66
Table A.3: IMDRF EP 5.8 – Medical devices and IVD medical devices that incorporate software or are software as a medical device....................................................................................................66
Table A.4: IMDRF EP 5.10 – Labelling.............................................................................................67
Table A.5: IMDRF EP 5.12 – Protection against the risks posed by medical devices and IVD medical devices intended by the manufacturer for use by lay users....................................................67
Table A.6: IMDRF EP 7.2 – Performance characteristics..................................................................68
Table B.1: IMDRF SaMD risk categories..........................................................................................70
Table B.2: IMDRF SaMD risk categories (revised)...........................................................................71
Table D.1: FG-AI4H data and AI solution quality assessment criteria..............................................74
Table G: AI4H Project Deliverable Reference ID..............................................................................92

List of Figures
Page
Figure 1: AI-MD Classification Criteria & Scope..............................................................................13
Figure 2(a): Regulatory roadmap – AI-medical device scope............................................................14
Figure 2(b): Regulatory roadmap – AI4H project deliverables scope................................................15
Figure 3: Product development life-cycle process (V-model)............................................................16
Figure 4: AI Software life-cycle diagram...........................................................................................16
Figure 5: Structure of the regulatory guidelines.................................................................................18
ITU-T FG-AI4H Deliverable 2.2

Guidelines for AI based medical devices – regulatory requirements

Summary
Artificial intelligence based technologies find extensive use in medical applications, and the
proliferation of AI at scale in global health holds great potential to significantly improve the
accessibility, quality, and cost-effectiveness of healthcare delivery and outcomes. Regulation plays
an important role in ensuring the safety of patients and users, and in the commercialization and
market acceptance of these AI based medical devices (AI-MD). Therefore, a streamlined and
systematic regulatory compliance process can help to expedite regulatory approval and to
reduce the time-to-market for these products. Software is an important aspect of AI based medical
devices, and as per the International Medical Device Regulators Forum (IMDRF), 'Software as a
Medical Device (SaMD)' is defined as software intended to be used for one or more medical
purposes that perform these purposes without being part of a hardware medical device, where
'medical purposes' include treatment, diagnosis, cure, mitigation, or prevention of disease or other
conditions.
Since artificial intelligence based technologies are data driven, they have the ability to learn from
real-world feedback data and thus improve and adapt their performance over time. Because of these
characteristics, they are identified and classified as belonging to the class of continuous learning
systems. The complexity introduced by these changes affects the clinical functionality, clinical
performance specifications and safety of the medical device, and may introduce new risks or even
modify existing risks. The fact that many AI processes are not transparent (the black box
phenomenon) also places barriers to acceptance. These characteristics raise important
technological, methodological, ethical, privacy, security, and regulatory issues, and there is an
absolute need for reasonable assurance mechanisms to maintain and/or improve the performance,
compatibility, safety and effectiveness of these medical devices.
Apart from these device-oriented issues, there are other challenges, including the lack of universally
accepted policies and guidelines for the regulation of AI based medical devices, which creates barriers
for these types of devices to scale up at the global level. Many medical device companies do not
have proper awareness of regulatory standards and thus fail to assess the potential implications of
safety, ethical and legal risks. There is a need for proper guidance mechanisms to educate and train
medical device manufacturers to work to the regulatory guidelines applicable to AI based devices.
Current regulatory requirements are generally designed to evaluate more conventional medical
devices, and there is also a need for regulatory policies and guidelines tailored to AI based
medical devices. Regulatory frameworks need to adopt a full product lifecycle approach that
facilitates continual product improvement in an iterative and adaptive manner in conformance with the
appropriate standards and regulations. Furthermore, regulators and manufacturers must establish a
system to ensure transparency and accountability of all the processes involved in AI medical device
development. To help manufacturers meet these requirements, the AI4H FG proposes a comprehensive
set of guidelines on how to conduct a regulatory review process for pre-market and post-market
product performance assessment and reporting. The main aim of these guidelines is to safeguard
patient safety as the first priority through a streamlined process for manufacturers that will help
ensure that products benefit patients by promoting health and minimizing risk. The proposed set of
guidelines adopts, extends and leverages best practices and recommendations provided by
international medical device regulatory agencies such as the IMDRF and the FDA.

1 Scope
This document defines a set of guidelines intended to serve the target audience with adequate
guidance on how to conduct a comprehensive requirements analysis and to streamline the
conformity assessment procedures to ensure regulatory compliance for AI based medical devices
(AI-MD).
Primary and secondary audience…
The scope of AI-MD, in this context, DOES include (a) medical devices with or without enforcement
of regulations, (b) Software-as-a-Medical Device (SaMD) or Software-in-a-Medical Device (SiMD),
and (c) healthcare applications intended to improve medical outcomes or the efficiency of the
healthcare system.
The scope of AI-MD, in this context, DOES NOT include software applications for (a) healthcare
facility administrative support, or (b) maintaining or encouraging healthy lifestyle, behaviour and
wellness.
This set of guidelines complements existing quality practices, recommendations and standards
developed by national and international medical device regulatory organizations, including ISO,
IEC, FDA, etc., within the AI-MD regulatory domain, and DOES NOT replace or conflict with
their respective regulatory processes and standards.
This set of guidelines DOES NOT include any legally binding or statutory requirements applicable
to any specific regulatory framework or specific geographic jurisdiction.

2 References
The following reference documents were reviewed as part of a broad literature survey towards
the design of the proposed regulatory requirements guidelines, considering aspects of regulations,
standards, guidelines, best practices, directives and laws that are relevant in the context of AI-MD.

[EU-IVDR], [EU-MDR (2017/745)] General definition of "medical device", general safety and performance requirements, technical documentation on design, functions, components, intended use, intended users, clinical evaluation, post-market surveillance, vigilance, etc.
[FDA 21 CFR] FDA 21 CFR Part 820, Quality System Regulations
[FDA] Guidance "General Principles of Software Validation"
[FDA] FDA's "Proposed Regulatory Framework for Modifications to Artificial Intelligence / Machine Learning (AI/ML)-Based Software as a Medical Device"
[GDPR] European General Data Protection Regulation
[IEC 62304] IEC 62304:2006 + A1:2015, "Medical device software – Software life cycle processes"
[IEC 62366] IEC 62366-1:2015, "Medical devices – Part 1: Application of usability engineering to medical devices"
[IEC 82304] IEC 82304-1, "Health software — Part 1: General requirements for product safety"
[IMDRF/GRRP WG/N47 FINAL: 2018] "Essential Principles of Safety and Performance of Medical Devices and IVD Medical Devices"
[ISO 13485] ISO 13485:2016, "Medical devices — Quality management systems — Requirements for regulatory purposes"
[ISO 14971] ISO 14971:2019, "Medical devices – Application of risk management to medical devices"
[ISO 9241-11] ISO 9241-11, "Ergonomics of human-system interaction — Part 11: Usability: Definitions and concepts"
[ISO 9241-210] ISO 9241-210, "Ergonomics of human-system interaction — Part 210: Human-centred design for interactive systems"
[ISO 14155] ISO 14155, "Clinical investigation of medical devices for human subjects — Good clinical practice"
[ISO/IEC 27000] ISO/IEC 27000, "Information technology — Security techniques — Information security management systems — Overview and vocabulary"
[ISO/IEC 27002] ISO/IEC 27002, "Information technology — Security techniques — Code of practice for information security controls"
[ISO/IEC 27002:2013] ISO/IEC 27002:2013, "Information technology — Security techniques — Code of practice for information security controls, TECHNICAL CORRIGENDUM 1"
[ISO/IEC 27002:2013] ISO/IEC 27002:2013, "Information technology — Security techniques — Code of practice for information security controls, TECHNICAL CORRIGENDUM 2"

3 Terms and definitions

3.1 Terms defined elsewhere


This document uses the following terms defined elsewhere:
3.1.1 Clinical Evaluation [GHTF/SG5/N1R8:2007]: The assessment and analysis of clinical
data pertaining to a medical device to verify the clinical safety and performance of the device when
used as intended by the manufacturer.
3.1.2 In Vitro Diagnostic (IVD) Medical Device [GHTF/SG1/N71:2012]: A medical device,
whether used alone or in combination, intended by the manufacturer for the in-vitro examination of
specimens derived from the human body solely or principally to provide information for diagnostic,
monitoring or compatibility purposes.
3.1.3 Life-Cycle [ISO/IEC Guide 51:2014]: All phases in the life of a medical device, from the
initial conception to final decommissioning and disposal.
3.1.4 Medical Device [GHTF/SG1/N71:2012]: Any instrument, apparatus, implement, machine,
appliance, implant, reagent for in vitro use, software, material or other similar or related article,
intended by the manufacturer to be used, alone or in combination, for human beings, for one or
more of the specific medical purpose(s) of: a) diagnosis, prevention, monitoring, treatment or
alleviation of disease, b) diagnosis, monitoring, treatment, alleviation of or compensation for an
injury, c) investigation, replacement, modification, or support of the anatomy or of a physiological
process, d) supporting or sustaining life, e) control of conception, f) disinfection of medical devices,
g) providing information by means of in vitro examination of specimens derived from the human
body; and does not achieve its primary intended action by pharmacological, immunological or
metabolic means, in or on the human body, but which may be assisted in its intended function by
such means.
3.1.5 Software as a Medical Device [IMDRF/SaMD WG/N12 FINAL:2014]: Software intended
to be used for one or more medical purposes that perform these purposes without being part of a
hardware medical device.
3.1.6 Software Validation [IEEE-STD-610]: The process of evaluating software during or at
the end of the development process to determine whether it satisfies specified requirements.
3.1.7 Software Verification [IEEE-STD-610]: The process of evaluating software to determine
whether the products of a given development phase satisfy the conditions imposed at the start of
that phase.

3.2 Terms defined here


This document defines the following terms:
3.2.1 term [reference]: definition
3.2.2 term [reference]: definition

4 Abbreviations and acronyms


This document uses the following abbreviations and acronyms.
AI Artificial intelligence
AI4H Artificial intelligence for health
AI-MD Artificial Intelligence based Medical Devices
DAISAM Data and AI Solution Assessment Methods
EP Essential Principle
FDA Food and Drug Administration
GDPR General Data Protection Regulation
HIPAA Health Insurance Portability and Accountability Act
IMDRF International Medical Device Regulators Forum
ITU International Telecommunication Union
IVD In vitro diagnostics
MDD Medical device directives
MDR (2017/745) Medical device regulation
SaMD Software-as-a-medical device
SiMD Software-in-a-medical device
WG Working group
WHO World Health Organization

5 Regulatory guidelines formulation: baseline references


The proposed guidelines were formulated based on a critical review of existing global regulations
and standards for AI related technologies in medical applications. The critical review included
identifying gaps in existing regulatory requirements assessment methods and incorporating a
quality risk management approach with the necessary monitoring and control parameters for improved
safety and efficiency of AI-MDs. A detailed list of regulatory references considered in the
formulation of the new guidelines is included in Annexes A to F.

6 Proposed roadmap for regulatory requirements guidelines

6.1 Guidelines: applicability scope


To define the applicability scope of the proposed guidelines, classification criteria based on a)
scope of regulation, b) scope of product and c) scope of application are used. The classification
criteria & scope of the proposed guidelines are illustrated in Figure 1.

Figure 1: AI-MD Classification Criteria & Scope

6.1.1 Scope of regulation


Medical device:
– with enforcement of regulations
– without enforcement of regulations

6.1.2 Scope of product

– Software is the product -> standalone software
– Software is embedded in a product

6.1.3 Scope of application

In healthcare:
– To improve medical outcomes (e.g. to support diagnosis, treatment, prevention, monitoring
and prediction of diseases and injuries)
– To improve the efficiency of the healthcare system (e.g. streamline processes)

6.2 Guidelines: aim & objectives

Aim
– to help manufacturers develop AI-based products conforming to the law and bring them to
market quickly and safely,
– to help internal and external auditors test the legal conformity of AI-based medical devices
and the associated life-cycle process.

Objectives
The objective of these guidelines is to provide target users with instructions and a concrete
checklist to
– understand the expectations of the regulatory bodies,
– promote step-by-step implementation of safe and effective Artificial Intelligence /
Machine Learning (AI/ML) based Software as a Medical Device,
– compensate for the lack of a harmonized standard (in the interim) to the greatest extent
possible.

6.3 Guidelines: target user class / roles

The following user classes / roles are the intended users of these guidelines:
– Quality Assurance Auditors / Managers
– Developers
– Testers
– Data scientists
– Clinical specialists
– Physicians
– Product Managers
– Medical Device Consultants
– Regulatory specialists
– Risk assessment specialists
– Service and support providers

6.4 Requirements assessment model: a product lifecycle development based approach

Figure 2(a): Regulatory roadmap-AI-medical device scope



Figure 2(a) shows the generic as well as the AI specific aspects that need to be considered under the
regulatory roadmap of medical devices. From Figure 2(a), it can be inferred that AI-MDs, as
continuous learning or adaptive systems, are subject to modifications throughout their lifecycle,
and these modifications can result in unforeseen outcomes for the device, including changes to core
device functionality and risk levels. These aspects pose additional challenges to device
manufacturers in terms of managing rapid development cycles and frequent software update and
distribution cycles. Hence, change management considerations tailored for AI-MDs are expected to
provide an appropriate level of control over these changes.

Figure 2(b): Regulatory roadmap- AI4H project deliverables scope

Figure 2(b) shows the relevant AI-MD specific deliverables produced as part of the AI4H FG
project. These AI4H deliverables cover the product development life-cycle processes that support
the regulatory roadmap scope for AI-MDs. Document identifiers of the AI4H deliverables are
listed in Table G: AI4H Project Deliverable Reference ID (Annex G) for further reference.
To facilitate continual product improvement in an iterative and adaptive manner in conformance
with the appropriate standards and regulations, it becomes imperative for any regulatory framework
to establish a system that can ensure transparency and accountability of all the life cycle processes
involved in AI-MD development. A brief rationale is provided here on the need for a product
lifecycle development process-oriented approach, which forms the basis of the proposed regulatory
requirements guidelines.

Figure 3: Product development life-cycle process (V-model)

Figure 4: AI Software life-cycle diagram


Figure 3 shows the V-model, which is widely accepted as the default product development lifecycle
model in software engineering practice.
– A V-model based regulatory roadmap is proposed with the aim of maximizing the completeness
and coverage of the various regulatory needs / aspects across the AI-MD life cycle processes:
requirements, design, development, testing, deployment, maintenance, etc.
– With the V-model, supported by the principles of transparency and real-world performance
monitoring, conformance assessments can be performed to measure and trace the
compliance of in-house processes with standardized regulatory assessment procedures, or
their deviation from them.
– Apart from compliance verification, the V-model emphasizes software process improvement
and supports the integration of best practices for process improvement to achieve improved
software quality, performance, safety, and effectiveness of the medical device.
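As an illustration of the traceability the V-model provides, each specification activity on the left-hand side of the V can be paired with the assessment activity that verifies or validates it on the right-hand side. The pairing below is a generic sketch of this idea; the phase names are illustrative and not a normative mapping from this deliverable:

```python
# Generic V-model pairing: each left-side specification activity is
# verified/validated by a corresponding right-side assessment activity.
# (Illustrative sketch only; phase names are not normative.)
V_MODEL_PAIRS = {
    "stakeholder / intended use requirements": "product validation (clinical evaluation, usability)",
    "software requirements specification": "software system testing",
    "architecture and design": "integration testing",
    "module implementation": "unit testing",
}

def assessment_for(phase: str) -> str:
    """Return the assessment activity that traces back to the given
    specification phase; raises KeyError if the phase is not covered."""
    return V_MODEL_PAIRS[phase]

# Usage: design decisions are traced to integration testing.
assert assessment_for("architecture and design") == "integration testing"
```

Such an explicit pairing is what allows compliance or deviation of in-house processes to be traced phase by phase against standardized assessment procedures.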

7 Regulatory guidelines: proposed structure

Figure 5 shows the proposed guidelines structure. The guidelines follow a process-oriented
approach, whereby all relevant processes and phases of the AI-MD life cycle are considered,
such as:
1. Research and development
2. Data management
3. Post-market surveillance
Accordingly, the guidelines do not set forth specific requirements for the products themselves, but
for the processes applicable to pre-market evaluation (which includes clinical evaluation) and post-
market evaluation. The requirements scope includes the following:
1. General requirements
2. Requirements for product development
a. Intended use
b. Software requirement specification
c. Data management
d. Model development
e. Product development
f. Product release
3. Requirements for phases following development

Figure 5: Structure of the regulatory guidelines



8 Regulatory guidelines: requirements checklist

A regulatory requirements assessment checklist is proposed as a standard assessment and reporting
tool to aid the regulatory auditing / review process. The checklist provides an ordered set of
verification and validation procedures on how to conduct a comprehensive review covering all
relevant aspects of the quality assurance pipeline.
The quality criteria for a checklist item include the following:
– It is atomic (not a combination)
– It can be checked within seconds or at most a few minutes
– The result is binary, i.e. either 'Yes' or 'No'
– It clearly specifies the necessary evidence
– It is understandable and verifiable even for non-experts
– It has to match / prove the requirement
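The quality criteria above can be made concrete in checklist tooling. The following sketch is hypothetical — the deliverable prescribes no machine-readable format, and all field names are ours — but it shows how an item satisfying the criteria might be represented, with a strictly binary result and a mandatory evidence field:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChecklistItem:
    """One atomic verification step of a regulatory checklist.

    Field names are illustrative only; the guidelines do not
    prescribe any machine-readable checklist format.
    """
    identifier: str          # e.g. a hypothetical ID such as "QM-01"
    requirement: str         # the requirement the item has to prove
    evidence: str            # the evidence the reviewer must inspect
    result: Optional[bool]   # binary outcome: True ('Yes') / False ('No')

    def is_answered(self) -> bool:
        # The result must be strictly binary -- no partial scores.
        return self.result is not None

# Usage: record the outcome of one atomic check.
item = ChecklistItem(
    identifier="QM-01",
    requirement="A SOP covering risk management exists",
    evidence="Approved, version-controlled SOP document",
    result=True,
)
assert item.is_answered()
```

Keeping each item atomic and binary, as the criteria require, is what makes the aggregated checklist results reproducible across reviewers.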
8.1 Pre-market requirements

Table 1: Process requirements

Requirement: The manufacturer should establish a quality management (QM) system that covers all life cycle phases.
Checklist items:
– There is at least one SOP (1) covering the design & development process, including verification and validation
– There is/are SOP(s) covering post-market surveillance and vigilance
– There is a SOP covering risk management
– There is a SOP covering Computerized Systems Validation (CSV)
– There is a SOP covering the data management (process)
– There is/are SOP(s) covering software delivery, service, installation, decommissioning
– There is a SOP covering customer communication, including handling of customer complaints
Product lifecycle phase: Full Lifecycle
Applicable standards / regulations: EU MDR (2017/745) Article 10.9; ISO 13485, e.g. clause 7.1; ISO 13485 clause 4.1.6

Requirement: The manufacturer should compile all product specific plans as required by the respective regulations.
Checklist items:
– There is a product specific development plan (including verification and validation)
– There is a product specific post-market surveillance plan
– There is a product specific clinical evaluation plan
– There is a product specific documented risk management plan
Product lifecycle phase: Full Lifecycle
Applicable standards / regulations: MDR (2017/745) Annex I (3); MDR (2017/745) Annex III (1.3); IEC 62304 clause 5.1; ISO 14971:2019 (4.2)

(1) Standard Operating Procedure. All SOPs have to be approved and be under version control.

Table 2: Competencies requirements

Requirement: The manufacturer should identify the roles inside the scope of its QM system that are directly or indirectly concerned with AI.
Checklist item(s):
– There is a list that specifies roles and responsibilities inside the manufacturer's organization involved in its product life cycle activities
– These roles include software developers, software testers, data scientists, experts in clinical evaluation, risk managers, usability engineers and domain experts
Checklist examples: Examples of domain experts are physicians, clinicians, nurses, lab technicians, pharmacists, etc. Additional roles may include the following:
– Regulatory affairs and quality managers
– Product managers
– Medical device consultants
– Service technicians, e.g. for update, upgrade, configuration, installation, capturing audit logs, etc.
– Support staff
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: ISO 13485 clause 5.5.1; EU MDR (2017/745) Article 10.9

Requirement: The manufacturer should ensure the necessary competencies for each role inside the scope of its QM system that is directly or indirectly concerned with AI.
Checklist item(s):
– There are documented competency requirements for each role.
– There is a documented procedure on user role training and allied training materials
– There are records that provide evidence that the competency requirements have been met.
Checklist examples: Examples of competencies relate to:
– Education
– Knowledge
– Skills: the capability to perform a particular task
Examples of training records are:
– (Self-)tests
– Artefacts that result from practicing a particular skill, e.g. documents
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: ISO 13485 clause 6.2

3.1.1 Intended use and stakeholder requirements

Table 3: Intended use requirements

Requirement: The manufacturer should determine the medical purpose of the medical device.
Checklist item(s): There is a documented specification of
– Indication, including disease, injury or physiological state
– Goal: e.g. diagnosis, treatment, monitoring, prevention, alleviation and/or prognosis
Checklist examples: The disease or injury is specified using ICD-10 codes (at least 3 digits). Increasing adherence is an example of improving treatment.
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: MDR (2017/745) Annex II (1.1); ISO 14971:2019 clause 5.2

Requirement: The manufacturer should specify other positive impacts on health care.
Checklist examples:
– Faster patient care, e.g. treatment, diagnosis
– Reductions in workload
– Reductions in costs of healthcare
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: MEDDEV 2.7/1 rev. 4; MDR (2017/745) Annex I (23.4)

Requirement: The manufacturer should specify the target patients.
Checklist item(s): There is a documented specification of
– Demographics (e.g. age, sex)
– Contraindications
– Co-morbidities
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: MDR (2017/745) Annex I (23.4); MDR (2017/745) Annex II (1.1); IEC 62366-1 clause 5.1

Requirement: The manufacturer should specify the intended part of the body or type of tissue the medical device shall interact with.
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: IEC 62366-1 clause 5.1

Requirement: The manufacturer should specify the operating principle.
Checklist item(s):
– There is a description of the task the ML model shall perform.
– There is a specification of the type of machine learning.
Checklist examples: Typical tasks include:
– Segmentation
– Detection
– Decision support
– Recommendation
– Process automation
– Search (e.g. similarities)
Typical dimensions include:
– Type of learning (supervised, unsupervised, semi-supervised, reinforcement)
– Time and type of learning (before placing on the market, during use, globally, per product instance, per hospital)
– Technical task (classification, regression, clustering, control)
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: IEC 62366-1 clause 5.1; MDR (2017/745) Annex II (1.1)

Table 4: Intended users and intended context of use requirements

Requirement: The manufacturer should characterize the intended users.
Checklist item(s):
– There is a list of intended primary and secondary users
– The characteristics and prerequisites that each user group has to fulfil are specified
Checklist examples: User characteristics may include:
– Education
– Experience in the medical domain
– Technical skills and knowledge
– Training to be accomplished
– Physical prerequisites and limitations (height, sight, disabilities)
– Intellectual and mental prerequisites and limitations
– Language skills
– Experience with the product type or technology
– Cultural and social background
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: MDR (2017/745) Annex I (5); MDR (2017/745) Annex II (1.1); IEC 62366-1 clause 5.1

Requirement: The manufacturer should characterize the intended use environment.
Checklist item(s): There is a documented specification of the
– physical use environment
– social use environment
– work environment
Checklist examples: The physical environment might include:
– Brightness
– Loudness, e.g. alarms
– Temperature
– Contamination
– Visibility
– Humidity, moisture
The social environment may include:
– Stress, mental workload
– Shift operation
– Number of people and frequently changing colleagues
The work environment may include:
– Typical dress code
– Wearing of gloves or other personal protection equipment
– Usage of tools
– Physical stress
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: MDR (2017/745) Annex I (5); IEC 62366-1 clause 5.1

Table 5: Stakeholder requirements

Requirement: The manufacturer should operationalize the goals listed in the intended use with quantitative values for product requirements.
Checklist item(s):
– There are documented user requirements
– There are documented quantitative performance requirements
Checklist examples: Examples of user requirements:
– 95% of radiologists working with the system detect the cancer
Examples of performance requirements:
– The system shall have a sensitivity of 97%
– The system must be able to detect coronary artery plaques of at least 0.2 mm diameter
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: MDR (2017/745) Annex I (23.4); MDR (2017/745) Annex III (1.1)

Requirement: The manufacturer should specify the runtime environment of the product regarding hardware and software.
Checklist item(s):
– The minimum hardware requirements are specified
– The minimum software requirements are specified
Checklist examples: Hardware requirements may include:
– Screen size, resolution and orientation
– Physical storage
– Network connectivity, e.g. bandwidth, latency, reliability
– Required peripherals such as printers, scanners, input devices
Software requirements may include:
– Operating system (incl. version)
– Browser (type, version)
– Virtualization (e.g. Java Runtime Environment, .NET, Docker, virtual machines)
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: ISO 13485 clauses 6.3 (infrastructure) and 7.5.1 (verification of infrastructure requirements for deployment); EU MDR (2017/745) Annex I, 17.4; IEC 62304 clause 5.2

Requirement: The manufacturer should identify and specify the data interfaces.
Checklist item(s):
– There is a list of data interfaces (can also be specified in a context diagram)
– The protocols are specified
– The formats are specified
– The semantic standards are specified
Checklist examples: Protocols might include:
– OSI protocols such as TCP/IP, HTTPS
– Bus systems such as CAN, USB
– Physical hardware connections
Formats might include:
– File formats (XML, JSON, PDF, docx, CSV, DICOM)
– Image formats (size, resolution, colour coding)
Semantic standards might include:
– Taxonomies, e.g. ICD-10, ATC
– Nomenclatures, e.g. LOINC
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: IEC 62304 clause 5.2

Requirement: The manufacturer should specify the requirements for input data for each inbound data interface.
Checklist item(s): There is a specification of input data
Checklist examples: Input data specifications may include:
– Sensor requirements
– Type of data capturing device
– Precision of data
– Size / quantity of data
– Type and technical parameters of the recording procedure (e.g. strength of magnetic field, number of electrodes)
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: IEC 62304 clause 5.2; ISO 14971:2019 clause 5.3

Requirement: The manufacturer should determine the regulatory requirements.
Checklist item(s):
– There is a list of countries / markets that the product shall be placed in.
– There is a list of laws, standards, regulations, directives and guidance documents
Checklist examples: The list might include documents such as:
– FDA guidance documents
– Standards (e.g. IEC 62304, ISO 13485)
– Laws and regulations, e.g. MDR (2017/745), IVDR
Product lifecycle phase: Intended use and stakeholder requirements specifications
Standard(s) / Regulation(s) applicable: MDR Annex IX (2.2); ISO 13485 (clauses 5.2 and 7.2.1)

Table 6: Inputs to risk management and clinical evaluation requirements

Requirement: The manufacturer should evaluate alternatives to the given product (e.g. other products, procedures, technologies).
Checklist item(s):
– There is a clinical evaluation
– The clinical evaluation lists alternative products, technologies and/or procedures
– There is a search protocol revealing how the manufacturer searched for alternatives
– The clinical evaluation assesses alternatives with respect to clinical benefits, safety / risks and performance
– The alternatives include non-ML-based technologies
– There is a statement confirming that the product reflects the state of the art.
Checklist examples: Alternative technologies might include:
– Other ML models
– Non-ML methods, e.g. classical algorithms
Performance parameters might include:
– Accuracy
– Specificity
– Sensitivity
– Response times
– Robustness
Product lifecycle phase: Risk management and clinical evaluation
Standard(s) / Regulation(s) applicable: MEDDEV 2.7/1; MDR (2017/745); ISO 14971:2019 clauses 4.2 and 10.9

Requirement: The manufacturer should compile a list of risks specifically associated with the use of the machine learning method.
Checklist item(s):
– The risk management file contains an analysis of hazards and related harms, with associated probabilities and severities, resulting from ML models not meeting the requirements.
– There is an FMEA that analyses the effects of ML models that do not meet the performance requirements
Checklist examples: Performance requirements might include:
– Accuracy
– Specificity
– Sensitivity
– Response times
– Robustness
Product lifecycle phase: Risk management and clinical evaluation
Standard(s) / Regulation(s) applicable: ISO 14971:2019 clauses 5.4 and 5.5; EU MDR (2017/745) Annex I (3)

Requirement: The manufacturer should analyse the reasonably foreseeable risks.
Checklist item(s): The risk management file analyses risks associated with
– non-specified user type / category
– non-specified use environment
– application of the product to patients other than those specified
– reasonably foreseeable misuse
Checklist examples: Non-specified users:
– other profession, e.g. nurse instead of physician
– missing training
Other patients:
– Different age, sex, race
– Other co-morbidities
– Different severity of disease or injury
Product lifecycle phase: Risk management and clinical evaluation
Standard(s) / Regulation(s) applicable: ISO 14971:2019 clause 5.2

3.1.2 Product and software requirements

Table 7: Functionality and performance requirements

Requirement: The manufacturer should derive, from the intended use and the stakeholder requirements, traceable quantitative quality criteria and requirements for the software and/or the algorithm.
Checklist item(s):
– There is a specification of quantitative minimum 'quality criteria'
– There is a 'traceability matrix' that links the intended use with the quantitative product quality requirements.
Checklist examples: 'Quantitative quality criteria' may include the following:
– for classification problems:
  o accuracy (mean or balanced accuracy)
  o positive predictive value (precision)
  o specificity and sensitivity
  o F1 score
  o area under the ROC curve (AUC)
– for regression problems:
  o mean absolute error
  o mean squared error
Example 1: The stakeholder requirement states that 95% of radiologists must be able to detect a cancer with the product. The requirement for the algorithm states that it must display a sensitivity of 97%.
Example 2: The stakeholder requirements state that arterial calcification must be detectable at a sensitivity of 92%. The requirements for the algorithm state that it must be able to predict the strength of the plaques in the blood to within 0.2 mm.
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: IEC 62304 clause 5.2; ISO 13485 clause 7.3.3

Requirement: The manufacturer should derive non-functional requirements from the intended use.
Checklist item(s): There is a specification of non-functional requirements such as:
– Repeatability / reproducibility
– Response times
– Availability
Checklist examples: Self-tests can be a means to verify the repeatability of a system. The specification of response times might depend on the number of users, the number of transactions, the frequency and the amount of input data, etc. Availability can be expressed as a percentage of time, a percentage of usages, or as mean time between failures.
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: ISO 13485 clause 7.2.1; MDR Annex I (17)

Requirement: The manufacturer should determine how the system behaves if the inputs do not meet the specified requirements.
Checklist item(s): There is a specification that describes how the system reacts to:
– incomplete data sets
– lack of data sets
– wrong data formats
– excessive data quantities
– data outside of specified value ranges
– wrong temporal sequence of data, etc.
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: ISO 25010; IEC 62304 clause 5.2; ISO 14971:2019 clause 5.4
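The quantitative quality criteria listed above can be computed directly from a confusion matrix or from paired predictions. The following sketch illustrates this with purely illustrative counts, not figures from any real device:

```python
# Sketch: computing the quality criteria named above for a binary classifier
# from confusion-matrix counts (tp/fp/tn/fn are illustrative).
def classification_metrics(tp, fp, tn, fn):
    """Standard classification criteria for a binary problem."""
    sensitivity = tp / (tp + fn)                # recall / true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    precision = tp / (tp + fp)                  # positive predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

def regression_errors(y_true, y_pred):
    """Mean absolute error and mean squared error for a regression problem."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return mae, mse

m = classification_metrics(tp=97, fp=10, tn=90, fn=3)
print(round(m["sensitivity"], 2))  # 0.97, i.e. a 97% sensitivity requirement is met
```

A traceability matrix would then link each such computed criterion back to the stakeholder requirement it operationalizes, as required by the first row of the table above.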

Table 8: User interface requirements

Requirement: The manufacturer should specify what the user interface must display in case of errors, e.g. if the inputs do not meet the specified requirements.
Checklist item(s): There is a specification of the user interface in case of:
– incomplete data sets
– internal errors
Checklist examples: See previous checklist item. UI output display modes may include the following:
– warning
– alert
– caution
– mean time between failures, etc.
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: IEC 62304 clause 5.2

Requirement: The manufacturer should determine whether there is a need for instructions for use and training materials.
Checklist item(s): Either there is an Instructions for Use (IFU) document, or the risk analysis reveals no risks that can be further mitigated by an IFU.
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: MDR (2017/745) Annex I (23)

Table 9: Additional software requirements

Requirement: The manufacturer should set forth requirements to detect internal errors.
Checklist item(s):
– The risk analysis considers risks that are caused by internal errors.
– The device specification specifies how manufacturers or service technicians can gain access to internal errors.
Checklist examples: Examples of interfaces include:
– Data and user interfaces to audit logs
– Monitoring ports
Examples of internal errors are:
– Runtime errors such as null pointer exceptions
– Resource overload, such as out-of-memory errors
– Lack of access to resources such as databases
– Compromised integrity of data and program code
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: MDR (2017/745) Annex I (17, 18, 23.4); IEC 62304 clauses 5.2, 5.3 and 7.1; ISO 14971:2019 clause 5.4

Requirement: The manufacturer should justify if the device takes decisions exclusively based on automatic data processing.
Checklist item(s):
– There are records of processing activities.
– There is a data protection impact assessment.
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: Art. 22 of the GDPR. Exceptions of Art. 22 section 2 may apply.

Table 10: Risk management & clinical evaluation requirements

Requirement: The manufacturer should assess the risks arising if the inputs do not meet the specified requirements.
Checklist item(s):
– The risk analysis assesses the risks of wrong inputs at each data interface.
– The risk analysis considers all relevant types of wrong inputs.
Checklist examples: Invalid / non-compliant input conditions may include the following:
– incomplete data sets
– lack of data sets
– wrong data formats
– excessive data quantities
– data outside of specified value ranges
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: ISO 14971:2019 clause 5.4; IEC 62304 clause 7.1

Requirement: The manufacturer should set the gold standard against which the quality criteria can be reviewed, and justify this choice.
Checklist item(s):
– The clinical evaluation lists alternatives.
– The clinical evaluation compares these alternatives with respect to the specified quality criteria.
– There is a documented justification for the selected ground truth.
Checklist examples: The gold standard is not the same as the alternatives. E.g. the gold standard for determining blood pressure is an invasive measurement, but this is not the alternative.
Product lifecycle phase: Product and software requirements specifications

Requirement: The manufacturer should analyse the risks arising if the outputs do not meet the specified quality criteria.
Checklist item(s): There is a risk assessment report / risk table that specifies risks in case the outputs do not meet the specified 'quantitative quality criteria'.
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: ISO 14971:2019 clause 5.4; IEC 62304 clause 7.1

Requirement: The manufacturer should assess the consequences if the system provides socially unacceptable / discriminatory outputs.
Checklist item(s): There is an assessment report on the consequences / implications of socially unacceptable outputs. The assessment report includes:
– Cost estimation for wrong clinical decision-making
– AI autonomy level assignment and associated risk acceptance criteria based on the criticality of the clinical use case and environment
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: TODO (EU framework on AI)

Requirement: The manufacturer should assess the risks arising if the system does not meet the specified non-functional requirements.
Checklist item(s): The risk analysis assesses risks arising from:
– Lack of availability / robustness
– Slow response times
– Interoperability problems
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: IEC 62304 clause 7.1

Requirement: With continuous learning systems, the manufacturer should mitigate risks that are specific to continuously learning systems.
Checklist item(s):
– The risk analysis assesses risks that are specific to continuous learning systems
– The risk management file specifies the respective risk mitigations.
Checklist examples: Examples of risk mitigation:
– option to reset the system
– self-tests
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: MDR (2017/745) Annex I (17)

Requirement: With continuous learning systems, the manufacturer should show quantitatively why the risk-benefit analysis is better than for non-continuously learning systems.
Checklist item(s): There is an analysis report showing a positive risk-benefit ratio compared to the state of the art. The clinical evaluation compares the benefits of continuously learning and non-continuously learning systems.
Product lifecycle phase: Product and software requirements specifications
Standard(s) / Regulation(s) applicable: ISO 14971:2019 clause 6
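Several requirements above concern the risks arising when inputs do not meet the specification. Screening inbound data against the specified ranges and types before it reaches the model can be sketched as follows; the field names, types and value ranges are hypothetical:

```python
# Sketch: screening an inbound record against a specification, covering the
# invalid-input conditions discussed above (missing values, wrong type,
# out-of-range values). SPEC is a hypothetical example, not a prescribed format.
SPEC = {
    "age":      {"types": (int, float), "min": 0,  "max": 120},
    "systolic": {"types": (int, float), "min": 50, "max": 300},
}

def validate_record(record):
    """Return a list of violations; an empty list means the record conforms."""
    problems = []
    for field, rule in SPEC.items():
        if field not in record or record[field] is None:
            problems.append(f"{field}: missing")
            continue
        value = record[field]
        if not isinstance(value, rule["types"]):
            problems.append(f"{field}: wrong type {type(value).__name__}")
            continue
        if not rule["min"] <= value <= rule["max"]:
            problems.append(f"{field}: {value} outside [{rule['min']}, {rule['max']}]")
    return problems

print(validate_record({"age": 54, "systolic": 130}))  # []
print(validate_record({"age": 250}))                  # out-of-range age, missing systolic
```

The risk analysis would then specify, for each violation class, how the system reacts (reject, warn, degrade gracefully), matching the behaviour specification required in Table 7.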

3.1.3 Data management requirements

Table 11: Data collection requirements

Requirement: The manufacturer should specify the number of test data sets.
Checklist item(s):
– There is a specification of the number of data sets.
– There is a rationale for this number.
Standard(s) / Regulation(s) applicable: ISO 13485 clause 7.3.7

Requirement: The manufacturer should specify the inclusion and exclusion criteria for individual data sets.
Checklist item(s):
– There is a specification of technical requirements
– There is a specification of patient attributes that have to be met to include a data set.
Checklist examples: Technical inclusion / exclusion criteria may include, for each attribute:
– Data ranges
– Data types (numeric (float, integer, etc.), ordinal, categorical, string / text, date / time, image / binary)
– Data formats (e.g. date and number formats)
– Units of measure
– Precision of numbers
– Attribute values
– File formats / types
– Sampling rates
– Image parameters such as compression, image size, resolution, colour coding, zoom
Inclusion / exclusion criteria for patient data may include the following attributes:
– demographic data (age, gender)
– physical parameters (height, weight)
– diseases
– vital parameters
– lab parameters
– presence of additional tests
– case history
– special conditions (e.g. patients having a heart pacemaker or lung surgery)
Product lifecycle phase: Data management
Standard(s) / Regulation(s) applicable: TODO: Extract from AI/ML standards

Requirement: The manufacturer should specify the data source requirements.
Checklist item(s):
– There is a list of allowed / expected data sources
– There is a specification of data source requirements
– There is a validation of surveys (justify the selection of the surveys, the time of the survey and possibly the method for their assessment, in particular if no standardized survey exists)
– Survey methods may include the type of questions, the types of answers, the decision to have open or closed questions, etc.
Checklist examples: Data sources may include:
– Medical devices
– In-vitro diagnostic devices
– Questionnaires
– Cameras
– Electronic patient records
Examples of input requirements:
– with or without contrast agent (MRT, CT)
– number of electrodes (ECG)
– voltage (X-ray, CT)
– position of the patient
Standard(s) / Regulation(s) applicable: TODO: Extract from AI/ML standards

Requirement: The manufacturer should specify the distribution of input data.
Checklist item(s):
– There is an analysis of factors causing a bias
– There is a specification of the distribution of relevant patient characteristics
Checklist examples: Factors causing biases include:
– Non-representative patient population
– Attributes that are irrelevant for the expected output
– Confusion of correlation and causation
– Specific data sources, type and location of data collection
Even if all individual data sets meet the specification, the distribution of the data might still not be representative and/or might cause a bias.
Standard(s) / Regulation(s) applicable: TODO: Extract from AI/ML standards

Requirement: The manufacturer should validate that the test and training data meet the specified criteria.
Checklist item(s):
– There is a description of how it is ensured that data sets that do not meet the inclusion criteria are actually excluded
– There are descriptive statistics
– There is a justification that the data are representative of the target population
– There is an analysis of potential "label leakage"
Checklist examples: Descriptive statistics may include the following:
– calculation of distributions (histograms),
– mean / average values,
– quartiles,
– joint distributions of features, correlations, etc.
Label leakage examples include:
– in the sorting (e.g. first the data of healthy persons, then of ill persons),
– in the hospital (e.g. if the severe cases originate from just one institution),
– in images (e.g. for skin cancer, one always sees a ruler)
Standard(s) / Regulation(s) applicable: TODO: Extract from AI/ML standards

Requirement: The manufacturer should ensure data protection.
Checklist item(s):
– There is a documented patient data protection policy
– There is a documented procedure for data anonymization / pseudonymization
– Data scientists do not have access to protected data
– There is a data protection officer
Checklist examples: Data could be derived from machine-to-machine (M2M) communication as well.
Standard(s) / Regulation(s) applicable: GDPR
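The descriptive statistics and representativeness checks listed above can be sketched with the Python standard library; the patient data and the target-population shares below are purely illustrative assumptions:

```python
import statistics
from collections import Counter

# Illustrative collected data set (not real patient data).
ages = [34, 45, 51, 62, 58, 47, 70, 66, 53, 41]
sexes = ["f", "m", "f", "f", "m", "m", "f", "m", "f", "m"]

# Descriptive statistics of the kind required above.
print("mean age:", statistics.mean(ages))
print("quartiles:", statistics.quantiles(ages, n=4))
print("sex distribution:", Counter(sexes))

def share_deviates(counts, category, target_share, tolerance=0.10):
    """Flag a potential sampling bias: the observed share of a category
    deviates from the documented target-population share by more than
    the tolerance (both values are illustrative assumptions)."""
    observed = counts[category] / sum(counts.values())
    return abs(observed - target_share) > tolerance

print(share_deviates(Counter(sexes), "f", target_share=0.5))  # False
```

Such summaries do not replace the documented justification of representativeness, but they produce the auditable evidence the checklist items ask for.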

Table 12: Data annotation requirements

Requirement: A manufacturer using "supervised learning" should derive the labels from the intended use and justify this selection.
Checklist item(s): There is a specification of "label" selection criteria for the "supervised learning" based machine learning task.
Product lifecycle phase: Data management

Requirement: A manufacturer using "supervised learning" should have a procedure to ensure correct labelling.
Checklist item(s):
– The procedure describes how the ground truth is derived
– The procedure specifies quantitative classification / segmentation criteria for labelling
– There is a justification for these criteria
– The procedure specifies how, and how frequently, the correctness of labelling is monitored
– The procedure specifies how to deal with inconsistent data annotations from multiple annotators
Checklist examples: If, for example, patients have to be classified as healthy or sick, the manufacturer must derive the criteria, specifically for the intended use, for when a patient is to be classified as healthy and when as sick.
Standard(s) / Regulation(s) applicable: ISO 13485 clause 4.1

Requirement: The manufacturer should ensure the competency of the persons responsible for labelling.
Checklist item(s):
– There is a specification of the number of people recruited for the "labelling" task
– There is a description of the training to be given to the persons responsible for "labelling"
– There is a specification of the competency level of the persons responsible for "labelling"
– There is a procedure for assessing the success of the training and the competency of the persons responsible for "labelling"
– There are respective records
Checklist examples: The results of the monitoring of the labelling can be used to continuously verify the fitness of the persons responsible for labelling.
Product lifecycle phase: Data management
Standard(s) / Regulation(s) applicable: ISO 13485 clauses 6.2 and 7.3.2

Table 13: Data pre-processing requirements

Requirement: The manufacturer should set forth a procedure that describes the pre-processing of the data before the data are used to train or test the model.
Checklist item(s): There is a documented procedure for data pre-processing:
– The procedure describes how the correctness of the interim steps and the final results is assessed through risk-based evaluations
– The procedure specifies how values with various measurement scales or units are detected and processed
– The procedure specifies how values are detected and processed that have been collected with various measurement methods
– The procedure specifies how missing values within data sets are detected and processed
– The procedure specifies how unusable data sets are detected and handled as per the data inclusion and exclusion criteria
Checklist examples: Data pre-processing steps may include the following:
– conversion,
– transformation,
– aggregation,
– normalization,
– format conversion,
– calculation of features,
– conversion of numerical data into categories, etc.
The "missing value" problem includes "missing at random" and "missing not at random".
"Missing value" processing techniques include:
– deleting the data set
– replacement by the average value of other data sets
– a new value "missing" (for categorical values), etc.
"Outlier" processing techniques include:
– deleting the data set
– correcting the value
– setting the value to a set value (min/max), etc.
Examples of unusable data sets may include:
– x-rays of poor quality as specified in the technical exclusion criteria, or patients/persons who do not meet the patient inclusion criteria, etc.
Product lifecycle phase: Data management
Standard(s) / Regulation(s) applicable: ISO 13485:2016 clause 4.1
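The "missing value" techniques listed above (deleting the record, imputing with the mean of other records, or mapping to an explicit "missing" category) can be sketched as follows; the records and field names are illustrative:

```python
import statistics

# Illustrative records with missing values in a numeric and a categorical field.
records = [
    {"hr": 72, "rhythm": "sinus"},
    {"hr": None, "rhythm": "afib"},
    {"hr": 88, "rhythm": None},
]

def drop_incomplete(rows):
    """Technique 1: delete any data set with a missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]

def impute(rows, numeric="hr", categorical="rhythm"):
    """Technique 2/3: mean imputation for numeric values,
    an explicit 'missing' category for categorical values."""
    mean_val = statistics.mean(r[numeric] for r in rows if r[numeric] is not None)
    return [{
        numeric: r[numeric] if r[numeric] is not None else mean_val,
        categorical: r[categorical] if r[categorical] is not None else "missing",
    } for r in rows]

print(len(drop_incomplete(records)))  # 1 complete record remains
print(impute(records)[1]["hr"])       # 80, the mean of 72 and 88
```

Whichever technique is chosen, the documented procedure should state it explicitly, since deletion and imputation can affect the distribution of the training data differently.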

Table 14: Documentation and version control requirements

Requirement: The manufacturer should trace data back to their data source.
Checklist item(s):
– There is a list of data sources
– There is a document describing the data processing steps
– There is a specification of rules for data inclusion and exclusion
– There is a rationale if additional data have been excluded, or if data have been kept despite not meeting the specification
Checklist examples: The description of data sources might include:
– Location (e.g. clinic)
– Capture device
Product lifecycle phase: Data management

Requirement: The manufacturer should document all software used for data processing.
Checklist item(s):
– There is a list of all software applications
– All applications are clearly identified
– It is identifiable whether the software is off-the-shelf or individually developed
Checklist examples: Means to identify a software are:
– Manufacturer
– Name of software
– Version of software
Product lifecycle phase: Data management
Standard(s) / Regulation(s) applicable: ISO 13485 clause 4.1

Requirement: The manufacturer should put all software under version control.
Checklist item(s):
– There is a policy (e.g. SOP) specifying the configuration and version control process
– There are records demonstrating that the software actually is under version control
– The software libraries and frameworks are identified and under version control
Standard(s) / Regulation(s) applicable: IEC 62304 clause 8
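Beyond keeping the software under version control, the traceability records above need to tie each processing step to the exact software environment and input data. A minimal, illustrative sketch (the record structure is a hypothetical example, not a prescribed format):

```python
import hashlib
import platform

# Sketch: capturing the software environment and a data fingerprint so that a
# documented processing step can be traced to exact versions and exact data.
def environment_record():
    """Snapshot of the runtime environment for the processing records."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }

def data_fingerprint(raw: bytes) -> str:
    """A cryptographic hash identifies the exact data a processing step used."""
    return hashlib.sha256(raw).hexdigest()

record = environment_record()
print(record["python"])
print(data_fingerprint(b"example data set")[:12])
```

Stored alongside the version-controlled code, such records provide the auditable link between data sources, processing software and results that the checklist items require.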

3.1.4 Model development requirements

Table 15: Model preparation requirements

Requirement: The manufacturer should deliberately select the features for training.
Checklist items:
– There is a list of features.
– There is a rationale as to why a feature is taken into account.
– There is a description of the dependency of the features among each other.
Checklist examples: Dependencies can be visualized e.g. with a directed acyclic graph (DAG).
Product lifecycle phase: Model Development
Standards/Regulations: TODO: Extract from AI/ML standards

Requirement: The manufacturer should deliberately divide the data into training, validation and test data.
Checklist items:
– There is a justification for the ratio of training, validation and test data.
– There is a documented stratification for dividing up the data into training, validation and test data.
– There is documentation that reveals how multiple data sets for one object are kept in the same "bucket" (training, validation and test data).
– There is a justification if data are not distributed at random.
Checklist examples: E.g. for data with rare features or labels, it may be necessary to distribute the data not just at random. An example for an object is a CT scan: the images of one series should not be distributed into the three different "buckets".
Product lifecycle phase: Model Development
Standards/Regulations: TODO: Extract from AI/ML standards

Requirement: The manufacturer should document how it ensures that the development team has no access to the test data.
Checklist items:
– There is a role-based policy for data access.
– There is a description how the development team is prevented from gaining access to test data.
Product lifecycle phase: Model Development
Standards/Regulations: TODO: Extract from AI/ML standards
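The "bucket" requirement above (e.g. that all images of one CT series stay in the same split) can be implemented with a group-aware split. The following is a minimal illustrative sketch, not a prescribed implementation; the record layout, split ratios and series identifiers are invented for the example:

```python
import random

def grouped_split(records, group_key, ratios=(0.7, 0.15, 0.15), seed=42):
    """Split records into train/val/test so that all records sharing a
    group id (e.g. all images of one CT series) land in the same bucket."""
    groups = sorted({group_key(r) for r in records})
    rng = random.Random(seed)          # fixed seed: reproducible, documentable split
    rng.shuffle(groups)
    n_train = int(ratios[0] * len(groups))
    n_val = int(ratios[1] * len(groups))
    bucket_of = {
        g: "train" if i < n_train else "val" if i < n_train + n_val else "test"
        for i, g in enumerate(groups)
    }
    split = {"train": [], "val": [], "test": []}
    for r in records:
        split[bucket_of[group_key(r)]].append(r)
    return split

# Toy data: 10 CT series with 3 images each
records = [{"series": s, "image": i} for s in range(10) for i in range(3)]
split = grouped_split(records, group_key=lambda r: r["series"])
```

Because whole series (not individual images) are shuffled into buckets, no series can leak from the training set into the test set.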

Table 16: Model training requirements

Requirement: The manufacturer should document model-specific data processing.
Checklist items:
– There is a document that describes which features have been recoded specifically for a model or technology.
Checklist examples: Examples of this are normalization, selection of class labels (e.g. 0 or 1), selection of column names, distribution of categorical values over multiple columns.
Standards/Regulations: ISO 13485 clause 4.1

Requirement: If there are several quality metrics, the manufacturer should document the quality metric(s) against which it wants to optimize the model and justify the choice based on the intended use.
Checklist items:
– There are one or more quality metrics identified and respective target values specified.
– There is a documented rationale how these quality metrics relate to the intended use.
Standards/Regulations: TODO: Extract from AI/ML standards

Requirement: The manufacturer should avoid overfitting.
Checklist items:
– There is a policy forbidding the use of test data to optimize the model (only training and validation data may be used).
– There is a justification of the choice of the hyperparameters.
– There is a justification of the choice of epochs.
Checklist examples: Visualization (e.g. learning curves) might be helpful for justification and to illustrate the impact of hyperparameters and epochs on quality metrics.
Standards/Regulations: TODO: Extract from AI/ML standards
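The learning-curve justification suggested above can be made concrete by recording training and validation loss per epoch and justifying the chosen number of epochs by the validation minimum. A toy sketch with synthetic loss curves (a real project would of course use its own recorded metrics):

```python
# Synthetic loss curves: training loss keeps falling, while validation
# loss is U-shaped -- the classic overfitting signature.
train_loss = [1.0 / (epoch + 1) for epoch in range(20)]
val_loss = [0.5 + (epoch - 8) ** 2 / 100 for epoch in range(20)]

# Justify the number of epochs by the validation-loss minimum; training
# past this point improves the training metric but not generalization.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
overfitting_after = (train_loss[-1] < train_loss[best_epoch]
                     and val_loss[-1] > val_loss[best_epoch])
```

Plotting the two curves and marking `best_epoch` yields exactly the kind of visual evidence the checklist asks for.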

Table 17: Model evaluation requirements

Requirement: The manufacturer should evaluate the correctness and robustness of the model.
Checklist items:
– There is a model validation report.
– There is a test specification and test results for the final evaluation of the model with new test data.
– There are documented values for specified quality metrics.
– There is an analysis of datasets that have been predicted particularly well and particularly poorly. (Example: a residual analysis in which the errors are plotted over the feature values.)
– There is an analysis of datasets that have been predicted with particularly high and particularly low confidence. (Example: for classification tasks, the model is particularly uncertain at probabilities around 0.5.)
– There is an evaluation of how and how strongly individual features had to change for the model to come to another prediction. (This is referred to as "counterfactuals". This, however, depends on the ML method and cannot be demanded as a general best practice.)
– For individual data sets there MAY be an evaluation of the features that particularly determined the model's decision. (Approaches include LIME (Local Interpretable Model-agnostic Explanations), BETA (Black Box Explanations through Transparent Approximations), LRP (Layer-wise Relevance Propagation) and feature summary statistics (incl. feature importance and feature interaction). This depends on the ML method and cannot be demanded as a general best practice.)
– There MAY be an analysis/visualization of the dependency (strength, direction) of the prediction on the feature values. (Examples are Shapley values, ICE plots and partial dependence plots (PDP). This depends on the ML method and cannot be demanded as a general best practice.)
– There MAY be a synthetization of data sets that activate the model particularly strongly. (Examples of synthetization can be found at http://yosinski.com/deepvis. This depends on the ML method and cannot be demanded as a general best practice.)
– There MAY be an approximation of the model using a simplified surrogate model. (A decision tree is an example of a surrogate model. This depends on the ML method and cannot be demanded as a general best practice.)
Standards/Regulations: MDR (2017/745) Annex I (17), Annex II (6.1); IEC 62304 clauses 5.5 ff.; ISO 13485 clauses 7.3.4 ff.

Requirement: The manufacturer should justify its ultimate selection of the model (architecture, hyperparameters).
Checklist items:
– There is documentation of various models that have been compared.
– There is a comparison of these models (architectures).
– There is a comparison of different sets of hyperparameters.
– The comparison includes quality metrics.
– The comparison not only assesses the quality metrics globally, but also separately for various features.
– There is a risk-benefit assessment that discusses interpretability, performance (e.g. quality metrics, efficiency) and robustness.
Checklist examples: For examples of quality metrics see above.
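One of the optional items above, approximating the model with a simplified surrogate, can be illustrated in a few lines: query the opaque model on a grid of inputs and fit the simplest interpretable stand-in, here a one-threshold rule. The "black box" below is a hypothetical stand-in, not any particular product's model:

```python
def black_box(x):
    """Stand-in for an opaque ML model (here secretly a linear rule)."""
    return 1 if 0.8 * x + 0.1 > 0.5 else 0

# Query the black box on a grid of inputs ...
xs = [i / 100 for i in range(101)]
ys = [black_box(x) for x in xs]

# ... and fit a one-split surrogate ("decision stump"): predict 1 iff x > t.
def stump_error(t):
    """Number of grid points where the surrogate disagrees with the model."""
    return sum((1 if x > t else 0) != y for x, y in zip(xs, ys))

best_t = min(xs, key=stump_error)
```

The resulting threshold `best_t` is a human-readable approximation of the model's decision boundary; for deeper models the surrogate would only be an approximation, which is exactly why its fidelity (here `stump_error`) must be reported alongside it.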

Table 18: Model documentation requirements

Requirement: The manufacturer should document the model.
Checklist items:
– There is documentation of the model (architecture).
– There is documentation of the selected hyperparameters.
– There is documentation of the used software libraries and frameworks (also SOUPs).
– There is documentation of the quality metrics and of the evaluation results, e.g. of performance and robustness as specified in Table 17: Model evaluation requirements.
Checklist examples: A way to document models is the "model card / sheet" that includes:
– model version
– assumptions, constraints, dependencies on the algorithm used
– current performance figures
– expected / optimal performance
– major risk conditions
ML models covered include: Linear Regression, Logistic Regression, k-nearest neighbours, Decision Trees, Random Forest, Gradient Boosting Machines, XGBoost, Support Vector Machines (SVM), k-means clustering, Hierarchical clustering, Neural Networks including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs), the Apriori algorithm, the Eclat algorithm, Stacked Auto-Encoders, Deep Boltzmann Machines (DBM), Deep Belief Networks (DBN), etc.
Standards/Regulations: TODO: Extract from AI/ML standards

Requirement: The manufacturer should apply version and configuration control to development artefacts.
Checklist items:
– There is an SOP for document control respectively version and configuration control.
– The following artifacts are (additionally to software code and libraries) under version control:
  o Configuration files, hyperparameters
  o Test and evaluation results (including quality metrics)
  o Software libraries and frameworks
Checklist examples: E.g. trained models can be serialized.
Product lifecycle phase: Model Development
Standards/Regulations: ISO 13485 clauses 7.3.10, 7.5.9.1
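A "model card" as mentioned above can be kept as a small structured artefact; serializing it deterministically and hashing the result gives a fingerprint that can be placed under version and configuration control alongside the serialized model itself. All field names and values below are illustrative assumptions, not a prescribed schema:

```python
import hashlib
import json

# Illustrative model card; a real one follows the manufacturer's template.
model_card = {
    "model_version": "1.2.0",
    "assumptions": "input images are 512x512 axial CT slices",
    "current_performance": {"sensitivity": 0.94, "specificity": 0.91},
    "expected_performance": {"sensitivity": 0.95, "specificity": 0.90},
    "major_risk_conditions": ["input outside specified value ranges"],
}

# Serialize deterministically (sorted keys) and fingerprint the artefact,
# so the exact documented state is identifiable under configuration control.
serialized = json.dumps(model_card, sort_keys=True).encode("utf-8")
fingerprint = hashlib.sha256(serialized).hexdigest()
```

The same hashing step can be applied to the serialized trained model, so that the model card, the weights and the evaluation results are all pinned to one identifiable configuration.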

3.1.5 Product development requirements

Table 19: Software development requirements

Requirement: The manufacturer should perform and document the required activities pursuant to IEC 62304.
Checklist items:
– There is a software development plan.
– If the model is implemented in another programming language or for another runtime environment, the plan defines which activities of model development have to be repeated.
– There is a verification plan that requires software system tests.
– The software safety class (alternatively Level of Concern) is determined.
– There is a software requirement specification (SRS).
– The SRS specifies user interface related requirements.
– There is a documented software architecture.
Checklist examples: Adhere to the normal best practices such as adherence to coding guidelines, review of code by code reviews using defined criteria, testing the code with unit tests with a defined coverage, etc.
Product lifecycle phase: Product Development
Standards/Regulations: IEC 62304

Requirement: The manufacturer should test software on the target hardware.
Checklist items:
– The test hardware is specified.
– The test hardware is representative for the target hardware.
– The tests verify whether the specified performance requirements are met.
Checklist examples: Performance may include response times and resource consumption. Hardware may include browser, mobile device, etc.
Product lifecycle phase: Product Development
Standards/Regulations: MDR (2017/745) Annex II 6.1

Requirement: The manufacturer should identify and verify all SOUP (respectively OTS) components.
Checklist items:
– There is a list of all SOUP / OTS components.
– Each SOUP / OTS component is uniquely identified.
– Each SOUP / OTS component is under version control.
– The requirements for each SOUP / OTS component are specified.
– There is a documented trace between these requirements and respective tests.
– The prerequisites for each SOUP / OTS component are specified.
Checklist examples: Components can be uniquely identified by manufacturer, name of component and version of component. Traces can be documented using ALM tools or tables. Examples for prerequisites are hardware (e.g. processor architecture, RAM) and software (e.g. operating system, run-time environments such as .NET, browser).
Product lifecycle phase: Product Development
Standards/Regulations: IEC 62304 clauses 5.3 and 8.1.2
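For Python-based products, part of the SOUP/OTS list above (name and version of each component) can be generated mechanically from the installed environment; manufacturer, licence and prerequisite data still have to be added by hand. A standard-library sketch, shown only as a starting point:

```python
from importlib import metadata

# Enumerate installed distributions as (name, version) pairs -- a raw
# starting point for the SOUP/OTS component list. Deduplicate and sort
# so the output is stable enough to place under version control.
soup_list = sorted(
    {
        (dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip entries with broken metadata
    }
)
```

Regenerating this list on every release and diffing it against the previous version is a simple way to notice unreviewed SOUP changes.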

Table 20: Accompanying materials requirements

Requirement: The manufacturer should provide instructions for use.
Checklist items:
– There are instructions for use.
– The instructions for use clearly identify the version of the product.
– There is a procedure specifying how to develop and verify instructions for use.
– The instructions for use are under version control.
Checklist examples: The identification of the product should be achieved by the product's UDI-DI.
Standards/Regulations: MDR (2017/745) Annex I (23.4)

Requirement: The instructions for use should describe the intended purpose and intended use.
Checklist items:
– The instructions for use specify the intended medical purpose and medical benefit.
– The instructions for use specify the intended patient population including indications, contraindications and, if relevant, other parameters.
– The instructions for use specify the patients / data / use cases for which the product may not be used.
– The instructions for use specify the requirements of the input data.
– The instructions for use specify the intended primary and secondary users pursuant to the intended use.
– The instructions for use describe the other conditions applicable to the product (e.g. runtime environment, use environment).
– The instructions for use describe how to update the product.
Checklist examples: The medical purpose and benefit typically are related to diagnosis, treatment, prognosis and monitoring of certain diseases or injuries. The patient population can be characterized by age, gender or accompanying diseases. Examples for input data requirements are formats, resolutions, value ranges, etc.
Standards/Regulations: MDR (2017/745) Annex I (23.4)

Requirement: The instructions for use should specify the performance of the product.
Checklist items:
– The instructions for use specify the quality metrics.
Checklist examples: Examples of quality metrics are specificity, sensitivity and precision.
Standards/Regulations: MDR (2017/745) Annex I (23.4)

Requirement: The instructions for use should explain the product and its inner workings.
Checklist items:
– The instructions for use indicate the data with which the model was trained.
– The instructions for use describe the model and algorithms.
– The instructions for use specify whether the product is further trained during use.
Standards/Regulations: MDR (2017/745) Annex I (23.4)?; TODO: Reference to AI/ML standards

Requirement: The instructions for use should reveal residual risks.
Checklist items:
– The instructions for use list the factors that could have a negative effect on the product's performance.
– The instructions explain risks arising from a product not meeting the performance requirements.
– The instructions for use list possible ethical problems.
Checklist examples: Examples for negative factors are a patient population deviating from the specified population and data not meeting the specified criteria (e.g. formats, value ranges).
Product lifecycle phase: Product Development
Standards/Regulations: EU MDR (2017/745) Annex I 23.4; ISO 14971:2019 clause 8

Requirement: The instructions for use should contain further information that is legally required.
Checklist items:
– The instructions for use identify the manufacturer.
– The instructions for use list channels for posing questions.
– The instructions for use contain references to licensing rights.
– The instructions for use contain the URL under which the most current versions of the instructions for use can be found.
Standards/Regulations: EU MDR (2017/745) Annex I 23.4; EU Regulation 207/2012
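The quality metrics named above (sensitivity, specificity, precision) can be computed directly from the confusion counts. A self-contained sketch with made-up labels, shown only to pin down the definitions:

```python
def quality_metrics(y_true, y_pred):
    """Sensitivity, specificity and precision from binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    return {
        "sensitivity": tp / (tp + fn),   # recall on the diseased population
        "specificity": tn / (tn + fp),   # recall on the healthy population
        "precision": tp / (tp + fp),     # positive predictive value
    }

# Invented ground truth and predictions for illustration only
m = quality_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1, 0])
```

Note that the denominators differ per metric, which is why instructions for use should state all three rather than a single accuracy figure.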

3.1.6 Product validation requirements

Table 21: Usability validation requirements

Requirement: The manufacturer should identify risks arising from a lack of usability.
Checklist items:
– The risk management file lists risks that arise from misunderstanding, overlooking or ignoring the product's visual output.
– The risk management file lists risks that arise from users blindly trusting or mistrusting the product.
Checklist examples: The product's visual output includes information displayed at the user interface as well as reports and printouts. The manufacturer could evaluate how obviously erroneous the system's output has to be before users become suspicious.
Standards/Regulations: IEC 62366-1 clause 4.1

Requirement: The manufacturer should assess whether the users understand the instructions for use.
Checklist items:
– The risk management file lists the risks that have to be mitigated by instructing users, e.g. by training or accompanying materials.
– The plan of the summative evaluation describes how the effectiveness of these measures is validated.
– The usability evaluation report reveals whether the instructions for use are adequate to mitigate risks.
Product lifecycle phase: Product Validation
Standards/Regulations: IEC 62366-1 (instructions for use are considered part of the accompanying documentation, which is considered part of the user interface)

Requirement: The manufacturer should evaluate all safety-relevant use scenarios.
Checklist items:
– There is a list of use scenarios.
– There is an assessment of safety relevance for each use scenario.
– The use scenarios included in the summative evaluation cover all safety-relevant use scenarios.
– The summative evaluation evaluates the effectiveness of all risk mitigation measures.
Standards/Regulations: IEC 62366-1 clause 5.4

Table 22: Clinical evaluation requirements

Requirement: The manufacturer should assess whether the promised medical benefit is achieved with the quality parameters.
Checklist items:
– The clinical evaluation contains the medical benefits the manufacturer claims.
– The clinical evaluation lists the data (sources) that have been evaluated and that support and that contradict the hypothesis that the benefits have been achieved.
– If the data have been collected from other products, then the clinical evaluation discusses the clinical and technical equivalence of the other products.
– The clinical evaluation evaluates the impact of quality parameters on the achievement of the medical benefit.
Checklist examples: The data are typically clinical data. The technical equivalence has to consider the software algorithms.
Product lifecycle phase: Product Validation
Standards/Regulations: EU MDR (2017/745) Article 61

Requirement: The manufacturer should assess whether the promised medical benefit is consistent with the state of the art.
Checklist items:
– The clinical evaluation lists alternative methods, technologies or procedures.
– The clinical evaluation compares the risks and benefits of these alternatives.
Checklist examples: Alternative approaches include a non-continuously learning model in comparison to a continuously learning model, and a classic algorithm in comparison to a machine learning model.
Product lifecycle phase: Product Validation
Standards/Regulations: EU MDR (2017/745) Article 61

3.1.7 Product release requirements

Table 23: Product release requirements

Requirement: The manufacturer should verify the completeness of documentation.
Checklist items:
– There is a risk management report concluding that all risk management related activities have been performed according to the risk management plan and that residual risks are acceptable.
– There is a usability evaluation report concluding that all activities of the formative and summative evaluation plan have been performed.
– There is a documentation of the model.
Checklist examples: The documentation of the model should at least cover all aspects mentioned in the chapter "instructions for use".
Standards/Regulations: MDR (2017/745) Annexes I and II

Requirement: If the manufacturer plans to market its product in the US market, it should compile the respective documentation.
Checklist items:
– There is a "Software as a Medical Device Pre-Specifications" (SPS) that anticipates changes to the product.
– There is an "Algorithm Change Protocol" (ACP) that specifies how these changes will be performed.
Product lifecycle phase: Product Release
Standards/Regulations: FDA: Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD)

3.2 Post-market requirements

3.2.1 Production, distribution & installation requirements

Table 24: Production, distribution and installation requirements

Requirement: The manufacturer should apply version and configuration control.
Checklist items:
– There is an SOP or work instruction that specifies how the manufacturer identifies artefacts and how it ensures that the correct artefacts in the respective version are delivered.
– Version and configuration control applies to the software as well as to accompanying materials such as instructions for installation and use.
– There is a bill of materials.
– There is a unique identification (ID) of the product.
Checklist examples: The bill of materials also contains all SOUP/OTS software. In the EU and in the US, there is typically the need for a UDI-DI and UDI-PI.
Standards/Regulations: IEC 62304 clause 8; FDA Cybersecurity Guidance; ISO 13485 clause 7.5.8

Requirement: The manufacturer should ensure the design transfer.
Checklist items:
– There is an SOP or work instruction that specifies how the persons responsible for installation know which is the most current version and how mistakes in installation can be ruled out.
– There are instructions for installation, update and decommissioning.
– These instructions specify the runtime environment.
– These instructions specify how the correct installation can be verified.
Checklist examples: The specification of the production runtime environment can include hardware (CPU, RAM), monitors and displays (size, resolution, orientation) and the operating system.
Standards/Regulations: ISO 13485 clause 7.3.8

Requirement: The manufacturer should ensure effective and efficient communication with operators and users.
Checklist items:
– There is an SOP covering customer communication including the handling of customer complaints.
– There is a website that contains information about the latest product releases.
– The website provides the means to download the software.
– The instructions for use reference this website.
– The instructions for use and the website reveal contact information, e.g. e-mail, phone number and/or a contact form.
Product lifecycle phase: Production, Distribution, and Installation
Standards/Regulations: ISO 13485 clauses 5.2 and 7.2; MDR (2017/745) Annex I (23.1)
- 59 -
FG-AI4H-I-036

3.2.2 Post-market surveillance requirements

Table 25: Post-market surveillance requirements

Requirement: When determining threshold values, the manufacturer should analyse how the application of the product might impact features (input values).
Checklist items:
– There is an analysis whether feedback loops can influence input values.
– There is an analysis whether self-fulfilling prophecies can influence input values.
– There is a specification of threshold values in the post-market surveillance plan.
Checklist examples:
– Example for a feedback loop: a travel recommendation app sends targeted advertising depending on a feature (last trip); this influences travel behavior.
– Example for a feedback loop: an algorithm provides prognoses; therefore, the physician will treat the patients better or earlier.
– Example for a self-fulfilling prophecy: an algorithm for predicting the date and location of crimes will cause higher surveillance by police, which will cause an increased number of detected crimes.
Product lifecycle phase: Post-Market Surveillance
Standards/Regulations: MDR (2017/745) Annex III (1.1)
Requirement: The manufacturer should compile a Post-Market Surveillance Plan.
Checklist items:
– There is an SOP specifying how to compile post-market surveillance plans.
– There is a post-market surveillance plan specifically for the product.
– The plan lists all data sources to be monitored.
– The data sources include scientific literature, customer communication (e.g. complaints), IT security databases, bug reports and release notes for SOUP / OTS, and databases of authorities (e.g. FDA).
– The plan describes for each data source how, how often and by whom data are collected.
– The plan specifies how data have to be analysed.
– The plan requires that quality metrics such as sensitivity and specificity are monitored.
– The plan specifies the data to be collected to be able to analyse whether the data in the field are consistent with the expected data or training data.
– The plan lists threshold values that trigger actions.
– The threshold values include quality metrics.
– These threshold values include features.
– The plan specifies the frequency and content of compiling post-market surveillance reports.
– The plan is approved.
Checklist examples:
– "By whom": not only persons / roles but also systems can be listed.
– For examples of additional quality metrics see above. Also, the variance of these quality metrics over time might be a quality metric (this allows visualization or quantification, in particular for non-normally distributed data, via the comparison of histograms or kernel density estimations).
– Actions might include re-evaluation of the risk-benefit analysis, re-training of the algorithm, or a product recall.
Product lifecycle phase: Product Release
Standards/Regulations: MDR (2017/745) Article 83; EU MDR Annex III (1.1)


Requirement: The manufacturer should perform post-market surveillance and compile reports, both according to the post-market surveillance plan.
Checklist items:
– There is a post-market surveillance report for each product respectively product type.
– The post-market surveillance reports clearly identify the respective products via their UDI.
– The post-market surveillance reports identify the post-market data and conclude whether activities are required.
Standards/Regulations: MDR (2017/745) Articles 85 f.
Requirement: The manufacturer should establish a post-market risk management system.
Checklist items:
– There is a specification how, how often and by whom the state of the art is monitored and re-assessed.
– The state-of-the-art assessment takes the latest algorithms for machine learning and for improving interpretability into account.
– The state-of-the-art assessment takes alternatives for the "ground truth" respectively the gold standard into account.
– There is a specification how, how often and by whom post-market data are evaluated for new or changed hazards, hazardous situations, and risks.
– The post-market risk analysis searches for (adverse) behavioural changes or (foreseeable) misuse.
– For products that have been placed on the market for more than one year, post-market risk management activities are documented.
Checklist examples:
– It is possible to combine post-market risk management and post-market surveillance.
– Interpretability includes transparency and explainability.
– The foreseeable misuse may include radiologists that rely on the software and don't look at the images anymore, so that they overlook findings.
– The foreseeable misuse can include users or operators not updating the software or using the product after the communicated end of life.
Standards/Regulations: ISO 14971 clause 10
Requirement: The manufacturer should assess the design change before deciding whether notified bodies respectively authorities have to be informed.
Checklist items:
– For products marketed in the US, there is an Algorithm Change Protocol (ACP) and a "SaMD Pre-Specifications" (SPS).
– There is a description of design changes.
– There is an impact analysis for these design changes.
Checklist examples: Descriptions of design changes take into account changes to
– Intended use
– ML architecture
– Software architecture
– Use of 3rd party libraries (SOUP, OTS)
– Programming language
– User interface including warnings
– Data interfaces
Standards/Regulations: MDR (2017/745) Articles 87 ff.; ISO 13485 clause 7.3.9; FDA's "Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device"
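The threshold-value monitoring described in the surveillance plan above can be reduced to a simple rule: compare field data for a monitored feature or quality metric against its training-time distribution and trigger the specified actions when a threshold is exceeded. A deliberately simple z-score sketch (real plans may instead compare whole histograms or kernel density estimates; the values are invented):

```python
import statistics

def drift_alarm(train_values, field_values, z_threshold=3.0):
    """Return True when the mean of the field data drifts more than
    z_threshold training standard deviations from the training mean."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    return abs(statistics.mean(field_values) - mu) / sigma > z_threshold

# Invented monitoring values for one input feature
train = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]
in_spec = drift_alarm(train, [10.1, 9.9, 10.3])   # field data resembles training data
drifted = drift_alarm(train, [14.0, 15.2, 14.6])  # exceeds threshold, triggers actions
```

The threshold and the triggered actions (e.g. re-evaluation of the risk-benefit analysis, re-training, recall) are exactly the quantities the post-market surveillance plan is required to specify in advance.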
Annex A:
IMDRF Essential Principles
IMDRF Essential Principles provide broad, high-level criteria for design, production, and postproduction throughout the life cycle of all medical devices and IVD medical devices, ensuring their safety and performance.
IMDRF Essential Principles (EPs) were evaluated to cover aspects considered applicable to the
regulation of SaMDs. Main IMDRF references include the following:
1. "Essential Principles of Safety and Performance of Medical Devices and IVD Medical
Devices", IMDRF Good Regulatory Review Practices Group, IMDRF GRRP WG/N47
FINAL, 31 October 2018 (http://www.IMDRF.org/docs/IMDRF/final/technical/IMDRF-tech-
181031-grrp-essential-principles-n47.pdf)
2. Table for use in mapping IMDRF Essential Principles (31 October 2018) to controls for
Artificial Intelligence and Machine Learning Algorithms utilized in Medical Technology
The scope of EPs applicable to AI-MDs covers the following:
A. Safety and Performance of Medical Devices – General Essential Principles
B. IMDRF Essential Principles Applicable to all Medical Devices and IVD Medical Devices
– General
– Clinical Evaluation
– Medical Devices and IVD Medical Devices that Incorporate Software or are Software as
a Medical Device
– Medical Devices and IVD Medical Devices with a Diagnostic or Measuring Function
– Labelling
– Protection against the Risks posed by Medical Devices and IVD Medical Devices
intended by the Manufacturer for use by Lay Users
C. Essential Principles Applicable to IVD Medical Devices
– Performance characteristics
Details on the Essential Principles and their mapping to AI4H concepts are given below.
NOTE: EP#: refers to original section numbers in the document- "Essential Principles of Safety and
Performance of Medical Devices and IVD Medical Devices", IMDRF Good Regulatory Review
Practices Group, IMDRF GRRP WG/N47 FINAL, 31 October 2018.
Table A.1: IMDRF EP 5.1 – General

EP # EP requirements EP key concepts


5.1.1 Medical devices and IVD medical devices should achieve the performance intended by their manufacturer and Performance; Intended
should be designed and manufactured in such a way that, during intended conditions of use, they are suitable for conditions of use; Safety;
their intended purpose. They should be safe and perform as intended, should have risks that are acceptable when Perform as intended;
weighed against the benefits to the patient, and should not compromise the clinical condition or the safety of Acceptable risks; Patient
patients, or the safety and health of users or, where applicable, other persons. benefits; Health
5.1.2 Manufacturers should establish, implement, document and maintain a risk management system to ensure the Risk management system;
ongoing quality, safety and performance of the medical device and IVD medical device. Risk management should be Quality; Safety; Performance;
understood as a continuous iterative process throughout the entire lifecycle of a medical device and IVD medical Continuous, iterative risk
device, requiring regular systematic updating. In carrying out risk management manufacturers should: management; MD life cycle
5.1.2 a) establish and document a risk management plan covering each medical device and IVD medical device; Risk management plan
5.1.2 b) identify and analyse the known and foreseeable hazards associated with each medical device and IVD medical Identify and analyse hazards
device;
5.1.2 c) estimate and evaluate the risks associated with, and occurring during, the intended use and during reasonably Risk; Intended use; Foreseeable
foreseeable misuse; misuse
5.1.2 d) eliminate or control the risks referred to in point (c) in accordance with the requirements of points 5.1.3 and 5.1.4 Risk elimination; Risk control
below;
5.1.2 e) evaluate the impact of information from the production and postproduction phases, on the overall risk, benefit-risk Continuous, iterative risk
determination and risk acceptability. This evaluation should include the impact of the presence of previously management
unrecognized hazards or hazardous situations, the acceptability of the estimated risk(s) arising from a hazardous
situation, and changes to the generally acknowledged state of the art.
5.1.2 f) based on the evaluation of the impact of the information referred to in point (e), if necessary, amend control Continuous, iterative risk
measures in line with the requirements of points 5.1.3 and 5.1.4 below. management; Update control
measures
5.1.3 Risk control measures adopted by manufacturers for the design and manufacture of the medical device and IVD Risk control measures; Safety
medical device should conform to safety principles, taking account of the generally acknowledged state of the art. principles compliance; State of
When risk reduction is required, manufacturers should control risks so that the residual risk associated with each the art; Risk control
hazard as well as the overall residual risk is judged acceptable. In selecting the most appropriate solutions,
manufacturers should, in the following order of priority:
5.1.3 a) eliminate or appropriately reduce risks through safe design and manufacture; Safe design
5.1.3 b) where appropriate, take adequate protection measures, including alarms if necessary, in relation to risks that Alarms; Risks that cannot be
cannot be eliminated; and eliminated
5.1.3 c) provide information for safety (warnings/precautions/contra-indications) and, where appropriate, training to users. Alarms; User training
5.1.4 The manufacturer should inform users of any relevant residual risks. Residual risk information for
user
5.1.5 In eliminating or reducing risks related to use, the manufacturer should: Risk reduction

5.1.5 a) appropriately reduce the risks related to the features of the medical device and IVD medical device and the Risk reduction; Intended usage
environment in which the medical device and IVD medical device are intended to be used (e.g. ergonomic/usability environment
features, tolerance to dust and humidity) and
5.1.5 b) give consideration to the technical knowledge, experience, education, training and use environment and, where Consider user knowledge
applicable, the medical and physical conditions of intended users.
5.1.6 The characteristics and performance of a medical device and IVD medical device should not be adversely affected to Stress resistance; Intended use;
such a degree that the health or safety of the patient and the user and, where applicable, of other persons are Expected life of device
compromised during the expected life of the device, as specified by the manufacturer, when the medical device and
IVD medical device is subjected to the stresses which can occur during normal conditions of use and has been
properly maintained and calibrated (if applicable) in accordance with the manufacturer's instructions.
5.1.7 Medical devices and IVD medical devices should be designed, manufactured and packaged in such a way that their -
characteristics and performance, including the integrity and cleanliness of the product and when used in accordance
with the intended use, are not adversely affected by transport and storage (for example, through shock, vibrations,
and fluctuations of temperature and humidity), taking account of the instructions and information provided by the
manufacturer. The performance, safety, and sterility of the medical device and IVD medical device should be
sufficiently maintained throughout any shelf-life specified by the manufacturer.
5.1.8 Medical devices and IVD medical devices should have acceptable stability during their shelf-life, during the time of Stability; Shelf life
use after being opened (for IVDs, including after being installed in the instrument), and during transportation or
dispatch (for IVDs, including samples).
5.1.9 All known and foreseeable risks, and any undesirable side-effects, should be minimized and be acceptable when Risk; Side-effects
weighed against the evaluated benefits arising from the achieved performance of the device during intended
conditions of use taking into account the generally acknowledged state of the art.
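The risk management steps of EP 5.1.2 and the control priority order of EP 5.1.3 can be expressed as a simple data structure. The sketch below is purely illustrative, not a normative implementation: the class names, numeric risk scores, and the aggregation rule for overall residual risk are assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from enum import Enum

class ControlPriority(Enum):
    """Order of priority for risk control measures (EP 5.1.3 a-c)."""
    SAFE_DESIGN = 1             # eliminate/reduce risks through safe design and manufacture
    PROTECTION_MEASURES = 2     # protection measures, including alarms, for remaining risks
    INFORMATION_FOR_SAFETY = 3  # warnings/precautions/contra-indications and user training

@dataclass
class Hazard:
    """A known or foreseeable hazard with its estimated and residual risk (EP 5.1.2 b-c)."""
    description: str
    estimated_risk: float                # e.g. probability x severity on an internal scale
    controls: list = field(default_factory=list)    # applied (ControlPriority, measure) pairs
    residual_risk: float = float("inf")  # risk remaining after controls are applied

@dataclass
class RiskManagementPlan:
    """Documented risk management plan covering one device (EP 5.1.2 a)."""
    device: str
    acceptability_threshold: float       # what the manufacturer judges acceptable per hazard
    hazards: list = field(default_factory=list)

    def is_acceptable(self) -> bool:
        # EP 5.1.3: the residual risk of each hazard, and the overall residual
        # risk, must both be judged acceptable. The overall-risk rule below
        # (sum vs. threshold * number of hazards) is an illustrative choice.
        per_hazard_ok = all(h.residual_risk <= self.acceptability_threshold
                            for h in self.hazards)
        overall = sum(h.residual_risk for h in self.hazards)
        return per_hazard_ok and overall <= self.acceptability_threshold * len(self.hazards)
```

In practice the acceptability criteria and scoring scales would come from the manufacturer's documented risk policy, taking the generally acknowledged state of the art into account.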

Table A.2: IMDRF EP 5.2 – Clinical evaluation

EP # EP requirements EP key concepts


5.2.1 Where appropriate and depending on jurisdictional requirements, a clinical evaluation may be required. A Clinical evaluation; Benefit-risk
clinical evaluation should assess clinical data to establish that a favourable benefit-risk determination determination; Clinical investigation
exists for the medical device and IVD medical device in the form of one or more of the following: report; Published scientific literature;
– clinical investigation reports (for IVDs, clinical performance evaluation reports) Clinical experience
– published scientific literature/reviews
– clinical experience
5.2.2 Clinical investigations should be conducted in accordance with the ethical principles that have their origin Ethical principles; Declaration of Helsinki
in the Declaration of Helsinki. These principles protect the rights, safety and well-being of human Rights; Safety; Well-being; Pre-study
subjects, which are the most important considerations and shall prevail over interests of science and protocol review; Informed consent;
society. These principles shall be understood, observed, and applied at every step in the clinical Leftover specimen
investigation. In addition, some countries may have specific regulatory requirements for pre-study
protocol review, informed consent, and for IVD medical devices, use of leftover specimens.

Table A.3: IMDRF EP 5.8 – Medical devices and IVD medical devices that incorporate software or are software as a medical device

EP # EP requirements EP key concepts


5.8.1 Medical devices and IVD medical devices that incorporate electronic programmable systems, including Electronic programmable systems;
software, or are software as a medical device, should be designed to ensure accuracy, reliability, precision, Software; Software as a medical device;
safety, and performance in line with their intended use. In the event of a single fault condition, appropriate Accuracy; Reliability; Precision; Safety;
means should be adopted to eliminate or appropriately reduce consequent risks or impairment of Performance; Single fault conditions; Risk
performance. reduction
5.8.2 For medical devices and IVD medical devices that incorporate software or are software as a medical State of the art; Principles of development
device, the software should be developed, manufactured and maintained in accordance with the state of the life cycle (e.g., rapid development cycles,
art taking into account the principles of development life cycle (e.g., rapid development cycles, frequent frequent changes, the cumulative effect of
changes, the cumulative effect of changes), risk management (e.g., changes to system, environment, and changes); Risk management (e.g., changes
data), including information security (e.g., safely implement updates), verification and validation (e.g., to system, environment, and data);
change management process). Information security (e.g., safely
implement updates); Verification;
Validation; Change management process
5.8.3 Software that is intended to be used in combination with mobile computing platforms should be designed Mobile computing platforms; Size;
and developed taking into account the platform itself (e.g. size and contrast ratio of the screen, Contrast ratio of the screen; Connectivity;
connectivity, memory, etc.) and the external factors related to their use (varying environment as regards Memory; External factors related to their
level of light or noise). use (varying environment as regards level
of light or noise)
5.8.4 Manufacturers should set out minimum requirements concerning hardware, IT networks characteristics and Minimum requirements; Hardware; IT
IT security measures, including protection against unauthorized access, necessary to run the software as networks characteristics; IT security
intended. measures; Protection against unauthorized
access
5.8.5 The medical device and IVD medical device should be designed, manufactured and maintained in such a Cybersecurity; Protection against
way as to provide an adequate level of cybersecurity against attempts to gain unauthorized access. unauthorized access

Table A.4: IMDRF EP 5.10 – Labelling

EP # EP requirements EP key concepts


5.10.1 Medical devices and IVD medical devices should be accompanied by the information needed to Information [Manual]; Safety;
distinctively identify the medical device or IVD medical device and its manufacturer. Each medical device Performance; Easily understood
and IVD medical device should also be accompanied by, or direct the user to any safety and performance
information relevant to the user, or any other person, as appropriate. Such information may appear on the
medical device or IVD medical device itself, on the packaging or in the instructions for use, or be readily
accessible through electronic means (such as a website), and should be easily understood by the intended
user.

Table A.5: IMDRF EP 5.12 – Protection against the risks posed by medical devices and IVD medical devices intended by the manufacturer for
use by lay users

EP # EP requirements EP key concepts


5.12.1 Medical devices and IVD medical devices for use by lay users (such as self-testing or near-patient testing Lay user; Self-testing; Intended use; Usage
intended for use by lay users) should be designed and manufactured in such a way that they perform variations (user technique, usage
appropriately for their intended use/purpose taking into account the skills and the means available to lay environment); Instructions; Easy to
users and the influence resulting from variation that can be reasonably anticipated in the lay user's understand; Easy to apply
technique and environment. The information and instructions provided by the manufacturer should be easy
for the lay user to understand and apply when using the medical device or IVD medical device and
interpreting the results.
5.12.2 Medical devices and IVD medical devices for use by lay users (such as self-testing or near-patient testing Lay user; Self-testing; Near-patient testing
intended for use by lay users) should be designed and manufactured in such a way as to:
5.12.2 a) ensure that the medical device and IVD medical device can be used safely and accurately by the Safety; Accuracy; Instructions; Risk
intended user per instructions for use. When the risks associated with the instructions for use cannot be reduction; Training
mitigated to appropriate levels, these risks may be mitigated through training.
5.12.2 b) appropriately reduce the risk of error by the intended user in the handling of the medical device or IVD Risk reduction; Risk of error; Handling;
medical device and, if applicable, in the interpretation of the results. Interpretation of results
5.12.3 Medical devices and IVD medical devices for use by lay users (such as self-testing or near-patient testing Lay users; Self-testing; Near-patient testing
intended for use by lay users) should, where appropriate, include means by which the lay user:
5.12.3 a) can verify that, at the time of use, the medical device or IVD medical device will perform as intended Verification; Intended use; Performance
by the manufacturer, and
5.12.3 b) is warned if the medical device or IVD medical device has failed to operate as intended or to provide a Warning; Failure; Valid result
valid result.

Table A.6: IMDRF EP 7.2 – Performance characteristics

EP # EP requirements EP key concepts


7.2.1 IVD medical devices should achieve the analytical and clinical performances, Performance characteristics; Analytical
as stated by the manufacturer that are applicable to the intended use/purpose, taking into account the performance; Clinical performance;
intended patient population, the intended user, and the setting of intended use. These performance Validation; State of the art
characteristics should be established using suitable, validated, state of the art methods. For example:
7.2.1 a) The analytical performance can include, but is not limited to, Traceability of calibrators and controls;
a. Traceability of calibrators and controls Accuracy of measurements (trueness and
b. Accuracy of measurement (trueness and precision) precision); Analytical sensitivity/Limit of
detection; Analytical specificity; Measuring
c. Analytical sensitivity/Limit of detection
d. Analytical specificity interval/range; Specimen stability
e. Measuring interval/range
f. Specimen stability
7.2.1 b) The clinical performance, for example diagnostic/clinical sensitivity, diagnostic/clinical specificity, Clinical performance; Diagnostic/clinical
positive predictive value, negative predictive value, likelihood ratios, and expected values in normal and sensitivity; Diagnostic/clinical specificity;
affected populations. Positive predictive value; Negative
predictive value; Likelihood ratios;
Expected values in normal and affected
populations.
7.2.1 c) Validated control procedures to assure the user that the IVD medical device is performing as intended, Validation; Control procedures; Intended
and therefore the results are suitable for the intended use. use
7.2.2 Where the performance of an IVD medical device depends on the use of calibrators or control materials, Calibrators; Control materials; Traceability
the traceability of values assigned to such calibrators or control materials should be ensured through of values; Reference measurement
available reference measurement procedures or available reference materials of a higher order. procedures; Reference materials of higher
order
7.2.3 Wherever possible, values expressed numerically should be in commonly accepted, standardized units and Numerical values; Standardized units; User
understood by the users of the IVD medical device. understanding
7.2.4 The performance characteristics of the IVD medical device should be evaluated according to the intended Performance evaluation; Intended use
use statement which may include the following:
7.2.4 a) intended user, for example, lay user, laboratory professional; Intended user
7.2.4 b) intended use environment, for example, patient home, emergency units, ambulances, healthcare centres, Intended use environment
laboratory;
7.2.4 c) relevant populations, for example, paediatric, adult, pregnant women, individuals with signs and Relevant population; Appropriate
symptoms of a specific disease, patients undergoing differential diagnosis, blood donors, etc. Populations representation; Ethnicity; Gender; Genetic
evaluated should represent, where appropriate, ethnically, gender, and genetically diverse populations so diversity; Representative population;
as to be representative of the population(s) where the device is intended to be marketed. For infectious Prevalence rates
diseases, it is recommended that the populations selected have similar prevalence rates.
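The clinical performance measures listed in EP 7.2.1 b) are simple functions of a 2x2 table of device results against true condition status. A minimal sketch, with illustrative variable names and counts:

```python
def clinical_performance(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute the EP 7.2.1 b) clinical performance measures from a 2x2 table:
    tp/fp = positive device results with/without the condition,
    fn/tn = negative device results with/without the condition."""
    sensitivity = tp / (tp + fn)   # diagnostic/clinical sensitivity
    specificity = tn / (tn + fp)   # diagnostic/clinical specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # positive predictive value (prevalence-dependent)
        "npv": tn / (tn + fn),     # negative predictive value (prevalence-dependent)
        "lr_positive": sensitivity / (1 - specificity),  # positive likelihood ratio
        "lr_negative": (1 - sensitivity) / specificity,  # negative likelihood ratio
    }
```

Note that the predictive values depend on disease prevalence in the evaluated population, which is one reason EP 7.2.4 c) asks for representative populations with similar prevalence rates.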

Annex B:
IMDRF SaMD Risk Categorization Framework
The IMDRF publication "Software as a Medical Device: Possible Framework for Risk
Categorization and Corresponding Considerations" characterizes SaMD by assigning
different risk levels based on the combination of the significance of the information provided
by the SaMD to the healthcare decision and the state of the healthcare situation or condition, as
shown in Table B.1.

Table B.1: IMDRF SaMD risk categories

The four categories (I, II, III, IV) shown in Table B.1 are based on the level of impact on the
patient or on public health, where accurate information provided by the SaMD to treat or diagnose,
drive or inform clinical management is vital to avoid death, long-term disability or other serious
deterioration of health, or to mitigate public health risks.
The categories are relative to each other: Category IV has the highest level of impact and
Category I the lowest.
The criteria for determining (a) the SaMD category and (b) the levels of autonomy are explained as
follows.

B.1 Criteria for determining the SaMD category


The criteria for determining whether an SaMD is Category IV are:
– SaMD that provides information to treat or diagnose a disease or conditions in a critical
situation or condition is a Category IV and is considered to be of very high impact.
The criteria for determining whether an SaMD is Category III are:
– SaMD that provides information to treat or diagnose a disease or conditions in a serious
situation or condition is a Category III and is considered to be of high impact.
– SaMD that provides information to drive clinical management of a disease or conditions in a
critical situation or condition is a Category III and is considered to be of high impact.
The criteria for determining whether an SaMD is Category II are:
– SaMD that provides information to treat or diagnose a disease or conditions in a nonserious
situation or condition is a Category II and is considered to be of medium impact.
– SaMD that provides information to drive clinical management of a disease or conditions in a
serious situation or condition is a Category II and is considered to be of medium impact.
– SaMD that provides information to inform clinical management for a disease or conditions in
a critical situation or condition is a Category II and is considered to be of medium impact.
The criteria for determining whether an SaMD is Category I are:
– SaMD that provides information to drive clinical management of a disease or conditions in a
non-serious situation or condition is a Category I and is considered to be of low impact.
– SaMD that provides information to inform clinical management for a disease or conditions in
a serious situation or condition is a Category I and is considered to be of low impact.
– SaMD that provides information to inform clinical management for a disease or conditions in
a non-serious situation or condition is a Category I and is considered to be of low impact.
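The nine criteria above form a simple lookup from the pair (state of the healthcare situation, significance of the information) to a category. The sketch below transcribes them directly; the key strings are illustrative labels, not IMDRF terminology:

```python
# Direct transcription of the B.1 criteria: (situation, significance) -> category.
SAMD_CATEGORY = {
    ("critical",    "treat_or_diagnose"):    "IV",   # very high impact
    ("critical",    "drive_clinical_mgmt"):  "III",  # high impact
    ("critical",    "inform_clinical_mgmt"): "II",   # medium impact
    ("serious",     "treat_or_diagnose"):    "III",  # high impact
    ("serious",     "drive_clinical_mgmt"):  "II",   # medium impact
    ("serious",     "inform_clinical_mgmt"): "I",    # low impact
    ("non_serious", "treat_or_diagnose"):    "II",   # medium impact
    ("non_serious", "drive_clinical_mgmt"):  "I",    # low impact
    ("non_serious", "inform_clinical_mgmt"): "I",    # low impact
}

def samd_category(situation: str, significance: str) -> str:
    """Return the IMDRF SaMD risk category for a given combination."""
    return SAMD_CATEGORY[(situation, significance)]
```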

B.2 Levels of autonomy


The IMDRF SaMD categories table was revised to account for various levels of autonomy, as
shown in Table B.2 below. Additional levels have been added to the "Treat or
diagnose" category:

Table B.2: IMDRF SaMD risk categories (revised)

Significance of information provided by software to healthcare decision

State of healthcare situation or condition | Treat or diagnose with no intervention possible | Treat or diagnose with override | Treat or diagnose with approval | Drive clinical management | Inform clinical management
Critical | VI | V | IV | III | II
Serious | V | IV | III | II | I
Non-serious | IV | III | II | I | I

Three different levels of autonomy proposed are:


1. Approval: the software may make suggestions to the user, but either it cannot take action on
its own, or it requires operator approval before taking action.
2. Override: the software can take action without approval, but the operator has the ability to
over-ride (cancel) the software if need be. For example, a human driver in a self-driving car
can take control.
3. No Intervention: the operator is not involved in the treatment and has no ability to override
the software.
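Table B.2 refines the "Treat or diagnose" column by these autonomy levels. Transcribed as a nested lookup (the key strings are illustrative labels):

```python
# Transcription of Table B.2: situation -> significance/autonomy -> category.
TABLE_B2 = {
    "critical": {
        "treat_no_intervention": "VI", "treat_with_override": "V",
        "treat_with_approval": "IV", "drive_clinical_mgmt": "III",
        "inform_clinical_mgmt": "II",
    },
    "serious": {
        "treat_no_intervention": "V", "treat_with_override": "IV",
        "treat_with_approval": "III", "drive_clinical_mgmt": "II",
        "inform_clinical_mgmt": "I",
    },
    "non_serious": {
        "treat_no_intervention": "IV", "treat_with_override": "III",
        "treat_with_approval": "II", "drive_clinical_mgmt": "I",
        "inform_clinical_mgmt": "I",
    },
}

def samd_category_with_autonomy(situation: str, significance: str) -> str:
    """Category under the revised table: higher autonomy raises the category."""
    return TABLE_B2[situation][significance]
```

For example, fully autonomous treatment in a critical situation is Category VI, whereas the same function requiring operator approval is Category IV.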

Annex C:
Johner regulatory guidelines for AI for medical devices
The Johner Guideline for AI-MDs is prepared and released by the Johner Institute, Germany. The
guideline is published under the Creative Commons License of type BY-NC-SA. The document is
managed via the version control system Git on the GitHub platform. Only the documents listed
in this repository are valid. Full documentation of the Johner Guidelines can be found at:
https://github.com/johner-institut/ai-guideline/blob/master/Guideline-AI-Medical-Devices_EN.md.

Johner Guidelines - Objectives


The objective of the Johner Guidelines is to provide medical device manufacturers and notified bodies
with instructions and a concrete checklist
– to understand what the expectations of the notified bodies are,
– to promote the step-by-step implementation of safety of medical devices that implement artificial
intelligence methods, in particular machine learning,
– to compensate for the lack of a harmonized standard (in the interim) to the greatest extent
possible.
Johner Guidelines - Scope
The Johner guidelines do not set forth specific requirements for products, but for processes. They
contain the following chapters:
1. General requirements
2. Requirements for product development
a) Intended use
b) Software requirement specification
c) Data management
d) Model development
e) Product development
f) Product release
3. Requirements for phases following development

Annex D:
FG-AI4H data and AI solution quality assessment criteria
Data and AI solution quality assessment criteria were formulated by the ITU-T Focus Group on AI
for Health's DAISAM Working Group, following FGAI4H-F-032-A01: Data and AI
solution assessment methods, and governed by FGAI4H-F-103: Updated FG-AI4H data acceptance and
handling policy.
Based on these criteria, a quality assessment questionnaire was prepared to serve as a preliminary
checklist intended to guide the various FG-AI4H topic groups in following a uniform procedure
for preparing the data and AI solution technical requirements specifications and submitting them in
a common reporting format.
This DAISAM quality assessment questionnaire includes a glossary that contains definitions for
technical terms specific to the data and AI solution quality criteria. This is provided to guide the
FG-AI4H topic groups in interpreting the quality assessment checklist in a clear and concise manner
and in mapping the respective technical requirement specifications.
The data and AI solution quality assessment criteria are listed in Table D.1.
Table D.1: FG-AI4H data and AI solution quality assessment criteria

AI Model Development Assessment Criteria Description Examples


Workflow
Problem Definition Underlying Task Underlying Task refers to the broad taxonomy – Classification
followed in organizing Machine Learning (ML) – Regression/Prediction
Tasks based on how the solution will be applied – Clustering
to solve or address the specific business problem
– Association rule learning
of the respective practice domain use cases.
(Please refer to sections Level-1A and Level-1B – Decision Support / Virtual Assistance /
of FGAI4H-C-104 for domain use-case thematic Recommendation systems
classifications.) – Matching
– Labelling
– Detection
– Segmentation
– Sequential data models
– Anomaly detection and Fraud Prevention
– Compliance Monitoring / Quality Assurance
– Process optimization / Automation
– Other
Data Preparation Input Data Sources, – Input Data refers to the subset of the dataset Input data sources include:
Types & Formats that is used to train the AI model – Electronic Health Records (Anonymised)
– Data Type refers to the type of the different – Medical Images
data attributes involved – Vital signs signals
– Data Format refers to the standard – Lab test results
representation formats of the different data
– Photographs
attributes involved
– Non-medical data (socioeconomic, environmental,
etc.)
– Questionnaire responses
– Free Text (Discharge / Summary, Medical History /
Notes, etc.)
– Other

Input Data Types include:


– Real valued
– Integer-valued
– Categorical value
– Ordinal value
– Strings
– Dates
– Times
– Complex data type
– Other

Standard Input Data Formats include:


– DICOM PS3.0 (latest versions) – for Diagnostic
Image (X-Ray, CT, MRI, PET, other pathological
slides, etc.)
– JPEG / PNG – for Static Image
– MP3 / OGG – for Audio
– MP4 / MOV – for Video
– SNOMED – for clinical observations/terminology
– LOINC – for laboratory observations
– WHO ICD-10 – for disease classifications
– RxNorm – for Medication Code
– Other
Data Preparation Output Data Types Output Data refers to type of data generated by – Binary/Class output (0 or 1) as in case of
the AI Model, when a particular ML algorithm is classification problems
applied on the Input Data – Probability output (0-1) as in case of classification
problems
– Continuous valued output as in case of regression
problems
Data Preparation Target Data Types Target Data refers to the output data in the – Binary/Class output (0 or 1) as in case of
training dataset that is defined as the reference classification problems
(ground truth) for AI Model validation/testing – Probability output (0-1) as in case of classification
problems
– Continuous valued output as in case of regression
problems
AI Model Selection Model Type Model Type refers to the specific machine Broad Classification of ML Algorithms include:
learning algorithm and its configuration that is – Supervised Learning based algos
applied on the training dataset in order to learn the – Linear Regression
Model
– Logistic Regression
– k-nearest neighbours
– Decision Trees
– Random Forest
– Gradient Boosting Machines
– XGBoost
– Support Vector Machines (SVM)
– Neural Network
– other
– Unsupervised Learning based algos
– k means clustering
– Hierarchical clustering
– Neural Network
– other
– Reinforcement Learning based algos
– Association rule learning based algos
– Apriori algorithm
– Eclat algorithm
– Deep learning based algos
– Convolutional Neural Network (CNN)
– Recurrent Neural Networks (RNNs)
– Long Short-Term Memory Networks (LSTMs)
– Stacked Auto-Encoders
– Deep Boltzmann Machine (DBM)
– Deep Belief Networks (DBN)
– other
AI Model Evaluation Evaluation Metrics Metrics used to quantify the errors and to evaluate – Model Accuracy (%)
the performance quality of the trained model on – Model Accuracy -Mean & Standard Deviation
the test dataset – Model Accuracy –Box Plot Summarization
Selection of metrics depends on the type of the – Root Mean Squared Error(RMSE)
problem & the type of the model under
– Sensitivity (True Positive Rate)
consideration
– Specificity (True Negative Rate)
– F1-Score (class wise performance determination)
– Confusion matrix
– K-fold Cross-validation
– Gain and Lift Charts
– Kolmogorov Smirnov Chart
– Gini Coefficient
– Log Loss
– Area under the ROC curve (AUC)
– Concordant – Discordant Ratio
– Other user defined performance measures
– Other
AI Model Optimization Optimization This deals with the iterative process (feedback Optimization techniques include:
Objective(s) principle) of reconfiguring or tweaking the Model – Adding or deleting Features /Attributes of the input
Parameters to their optimal values in order to data
achieve the desired level of accuracy or – Aggregating or Decomposing Features /Attributes
performance score in comparison with the of the input data
baseline definition.
– Tuning Model Hyper-parameters
– Normalization & Standardization of input data
Model performance can be systematically tracked
– Changing the learning rate of the algorithm
by maintaining progressive versions of Code, – Examining the Statistical Significance of results
Model, and Data. – Recruiting Ensemble Methods for combining /
augmenting the prediction scores of multiple models
– Monitoring and tracking API response times and
Computational Memory requirements of the serving
infrastructure
– Etc.
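The iterative tune-and-track loop described above can be sketched as follows. This is an illustrative sketch: `train_and_score` stands in for an arbitrary training-and-evaluation routine, and the toy scoring function is an assumption for demonstration only.

```python
def tune_hyperparameter(train_and_score, candidates, baseline_score):
    """Try candidate values for one hyperparameter (e.g. the learning rate),
    keep a progressive history of trials, and return the best configuration
    that improves on the baseline definition."""
    history = []                                  # version tracking of each trial
    best_value, best_score = None, baseline_score
    for value in candidates:
        score = train_and_score(value)
        history.append({"param": value, "score": score})
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score, history

# Illustrative usage with a toy scoring function peaking at lr = 0.1.
best_lr, best_score, trials = tune_hyperparameter(
    train_and_score=lambda lr: 1.0 - abs(lr - 0.1),
    candidates=[0.001, 0.01, 0.1, 1.0],
    baseline_score=0.5,
)
```

In a real setting the `history` list corresponds to keeping progressive versions of code, model, and data, so that each trial remains reproducible.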
Safety Standards Safety tool(s) training This deals with the user training/orientation given Safety Risk Mitigation and Management Plan &
Compliance on how to identify potential human safety risks Procedure
occurring due to accidental or malicious misuse of
the technology involved in AI Model deployment
Safety tool(s) This deals with the incorporation of necessary – Adopting governance procedures to assert
deployment preventative system measures/tools as per the alternative system fault tolerance plans
defined Risk Mitigation Plan to ensure that no – Adopting security mechanisms like
damage or harm is caused to human safety out of o Authentication
potential physical or cyber-attacks on the AI
o Role based Access Control
Model being applied.
o Encryption
o Transport Level Security
o Informed Consent
o Anonymisation
o etc
– Maintaining Data Audit Logs for secure content
verification, based on
o Blockchain Technology
o Merkle Trees
o etc
– Implementing Security Standards based on Digital
Certificate, SSL, SHA-256, etc.
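A data audit log for secure content verification, as mentioned above, can be built by chaining SHA-256 hashes so that each entry commits to its predecessor (a Merkle tree generalizes the same idea to efficient partial verification). A minimal sketch, with illustrative entry fields:

```python
import hashlib
import json

class AuditLog:
    """Append-only data audit log in which each entry commits to the previous
    entry's SHA-256 hash, so any later tampering breaks the chain."""

    def __init__(self):
        self.entries = []
        self._last_hash = "0" * 64  # genesis value before the first entry

    def append(self, event: dict) -> str:
        record = {"event": event, "prev": self._last_hash}
        digest = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()).hexdigest()
        self.entries.append((record, digest))
        self._last_hash = digest
        return digest

    def verify(self) -> bool:
        """Recompute every hash and check the chain links are intact."""
        prev = "0" * 64
        for record, digest in self.entries:
            if record["prev"] != prev:
                return False
            recomputed = hashlib.sha256(
                json.dumps(record, sort_keys=True).encode()).hexdigest()
            if recomputed != digest:
                return False
            prev = digest
        return True
```

Production systems would additionally sign the digests (e.g. with a digital certificate) and store them append-only, but the tamper-evidence principle is the same.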
AI Model Testing Test Data Quality Tests Test Data refers to the subset of the dataset and Standard Test Options include:
not part of the training dataset that is used to
evaluate the ML Model accuracy after its primary – Training and testing on the same dataset
vetting by the validation dataset
– Split tests
– Multiple split tests
Quality tests are performed to minimize the noise
– Cross validation
and variance of the test data in order to maximize
the performance accuracy of ML algorithm – Multiple cross validation
applied on it – Statistical significance
Annex E:
ITU ML5G high-level requirements mapping to AI for health requirements
Requirements analysis was performed on the ITU-T FG-ML5G Technical Specification "Unified
architecture for machine learning in 5G and future networks" to identify high-level requirements
that could be translated and applied to the regulatory assessment of AI-MDs. The list of high-level
requirements is given in the following tables.
ITU ML5G Req. Code ML-unify-001
Requirement Multiple sources of data are recommended to be used to take advantage of
correlations in data.
Description In future networks, sources of data may be heterogeneous, integrated with
different NFs, and may report different formats of data. These varied
"perspectives" can provide rich insights upon correlated analysis.
Example: Analysis of data from UE, RAN, CN and AF is needed to
predict potential issues related to quality of service (QoS) in end-to-end
user flows.
Thus, an architecture construct to enable the ML pipeline to collect and
correlate data from these varied sources is needed.
Relevance for Healthcare / Despite explicit parameters set for representativeness, a small number of
Assessment Sven (MXH) data sources increases the risk of data bias. Since this is always
application-dependent, this should not be required but recommended.
Required / Recommended? Recommended

ITU ML5G Req. Code ML-unify-005


Requirement Logical entities of the ML pipeline are required to be capable of splitting
their functionalities or be hosted on separate technology-specific nodes.
Similarly, multiple logical entities are required to be capable of being
implemented on a single node.
Description In future networks, HAS for NFs will optimize the location and the
performance accordingly. The network function virtualization orchestrator
(NFVO) plays an important role in this. To carry forward such benefits to
the ML use case, similar optimizations should also be applied to ML
pipeline nodes. Moreover, the constraints applicable to an ML pipeline
[e.g., training may need a graphic processor unit (GPU) and may need to
be done in a sandbox domain] may be unique.
Relevance for Healthcare / This roughly falls into the category of distributed training / inference /
Assessment Sven (MXH) federated learning. At MXH, we are not too convinced by the latter yet,
since it is far from concrete application and too high-level to make a
reasonable judgement on. For pragmatic reasons, we would assess this as
'recommended'.
Required / Recommended? Recommended

ITU ML5G Req. Code ML-unify-011


Requirement Intention is required to specify the sources of data, repositories of models,
targets/sinks for policy output from models, constraints on resources / use
case.
Description The separation between the technology-agnostic part of the use case and
technology-specific deployment (e.g., 3GPP) is captured in the design time
of future network services. Intent specification for the ML use cases
achieves this separation for the ML overlay. See clauses 3.2.5 and 3.2.6
for definitions.
Relevance for Healthcare / Specification of data sources is required to provide transparency on
Assessment Sven (MXH) robustness, e.g. to exclude misfit situations with unclear model outcome.
Required / Recommended? Required

ITU ML5G Req. Code ML-unify-017


Requirement Model training is required to be done in the sandbox using training data.
A sandbox domain is recommended to optimize the ML pipeline.
Simulator functions hosted in the sandbox domain may be used to derive
data for optimizations.
Description Model training is a complicated function with several considerations: use
of specific hardware for speed, availability of data (e.g., data lakes),
parameter optimizations, avoiding bias, distribution of training (e.g.,
multi-agent reinforcement learning), and the choice of loss function for
training. The training approach may, for example, use exploration of
hyperparameters.
Moreover, in future networks, operators will want to avoid service
disruptions while model training and updates are performed.
These considerations point to the use of a simulator for producing the data
for training the models, as well as its use in a sandbox domain.
Relevance for Healthcare / Separation of development and production setting is required because
Assessment Sven (MXH) uncontrolled, continuous learning imposes the risk of unexpected model
biases.
Required / Recommended? Required
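The required separation between the sandbox and the production setting can be illustrated with a toy promotion gate: a model is only moved out of the sandbox once it passes evaluation there. The stage names, threshold, and the trivial parity "model" are illustrative assumptions:

```python
# Minimal sketch of the sandbox/production separation: a model may only be
# promoted to production after it passes evaluation inside the sandbox.
# Stage names and the 0.9 threshold are illustrative assumptions.

def evaluate_in_sandbox(model, validation_data):
    """Stand-in for sandbox evaluation; here, accuracy on held-out pairs."""
    correct = sum(1 for x, y in validation_data if model(x) == y)
    return correct / len(validation_data)

def promote(model, validation_data, threshold=0.9):
    """Promote to production only if sandbox performance meets the threshold."""
    score = evaluate_in_sandbox(model, validation_data)
    return ("production", score) if score >= threshold else ("sandbox", score)

model = lambda x: x % 2              # toy 'trained' model: parity classifier
data = [(i, i % 2) for i in range(10)]
stage, score = promote(model, data)
assert stage == "production" and score == 1.0
```

In a real pipeline the evaluation step would run on simulator-derived data inside the sandbox domain, and promotion would additionally be subject to the risk controls discussed elsewhere in this deliverable.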

ITU ML5G Req. Code ML-unify-018


Requirement The capabilities to enable a closed loop monitoring and update, based on
the effects of the ML policies on the network, are required.
Description A closed loop is needed to monitor the effect of ML on network operations.
Various KPIs are measured constantly, and the impact of the ML algorithm
on them, as well as on the ML pipeline itself (due to operations of the
MLFO), is monitored and corrected constantly. These form inputs to the
simulator that generates data. These data can accordingly cover new or
modified scenarios in future (e.g., if a new type of anomaly is detected in
the network, the simulator is modified to include such data, which can also
be used to train the model to detect that anomaly type).
Relevance for Healthcare / Sounds like monitoring of ML algorithm performance in the production
Assessment Sven (MXH) setting. Reasonable thing to do in order to be able to intervene if outcomes
don't hold up to expectations and might cause risks to patient safety.
Required / Recommended? Required
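The closed-loop idea described above can be sketched in a few lines: KPIs are checked against bounds, and a violation triggers a corrective action that feeds back into the pipeline. The KPI names, bounds, and action label are illustrative assumptions, not part of the requirement:

```python
# Sketch of a closed loop: KPIs are measured continuously, and a corrective
# action (e.g., scheduling retraining with new simulator scenarios) is
# triggered when a KPI violates its bound. Names and bounds are illustrative.

KPI_BOUNDS = {"latency_ms": (0, 50), "accuracy": (0.9, 1.0)}

def check_kpis(measurements: dict) -> list:
    """Return the KPIs whose value falls outside its bound (missing KPIs pass)."""
    return [k for k, (lo, hi) in KPI_BOUNDS.items()
            if not lo <= measurements.get(k, lo) <= hi]

def closed_loop_step(measurements: dict) -> str:
    violated = check_kpis(measurements)
    # A violation feeds back into the pipeline, e.g., as new simulator data.
    return f"retrain: {violated}" if violated else "ok"

assert closed_loop_step({"latency_ms": 30, "accuracy": 0.95}) == "ok"
assert closed_loop_step({"latency_ms": 80, "accuracy": 0.95}) == "retrain: ['latency_ms']"
```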

ITU ML5G Req. Code ML-unify-019


Requirement A logical orchestrator (MLFO: ML function orchestrator) is required to be
used for monitoring and managing the ML pipeline nodes in the system.
The MLFO monitors the model performance, and model reselection is
recommended when the performance falls below a predefined threshold.
Description The varied levels and sources of data (core, edge), including the simulator
and the sandbox domain, imply that there could be various training
techniques, including distributed training. Complex models that are
chained (or derived) may in fact be trained using varied data. The
performance of such models can be determined and compared in the
sandbox domain using a simulator. Based on comparisons, operators can
then select the model for specific use cases. This can be used in
conjunction with the MLFO to reselect the model. Note: evaluation may
involve network performance evaluation along with model performance.
Relevance for Healthcare / As with the previous point, this concerns monitoring of model outcomes.
Assessment Sven (MXH)
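The threshold-triggered reselection behaviour described for the MLFO can be sketched as follows; the threshold value, model names, and sandbox scores are illustrative assumptions:

```python
# Sketch of MLFO-style model reselection: when the active model's monitored
# performance drops below a predefined threshold, the best candidate from
# the sandbox comparison is selected instead. All values are illustrative.

def reselect(active: str, monitored_score: float,
             sandbox_scores: dict, threshold: float = 0.85) -> str:
    """Keep the active model unless it underperforms; then pick the best candidate."""
    if monitored_score >= threshold:
        return active
    return max(sandbox_scores, key=sandbox_scores.get)

candidates = {"model_a": 0.88, "model_b": 0.93}
assert reselect("model_a", 0.90, candidates) == "model_a"   # above threshold
assert reselect("model_a", 0.80, candidates) == "model_b"   # reselect best candidate
```

As the requirement notes, a real evaluation could combine network performance with model performance in the score used for comparison.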
Annex F:
DIN SPEC 92001 – AI device life cycle process requirements

DIN SPEC 92001-1

ICS 35.080; 35.240.01

Artificial Intelligence – Life Cycle Processes and Quality Requirements – Part 1: Quality
Meta Model

1. Introduction

Challenge: Quality assessment of an AI module still poses a major challenge. Confirming,
verifying, and validating an AI module during conception, development, deployment,
operation, and retirement are wide-ranging and increasingly difficult tasks.

Abstract: This document introduces an AI quality meta model to outline key aspects of AI quality
including the previously mentioned AI quality pillars. For AI quality analysis, an approach for risk
evaluation and a suitable software life cycle are provided. The given AI life cycle is consistent with
the international standard for systems and software engineering. The second part of this
specification, DIN SPEC 92001-2, provides specific AI quality requirements.

Scope

Purpose Establish a quality-assuring and transparent
life cycle of AI modules. Critical quality
criteria are identified and AI-specific
problems are addressed. To achieve this,
this document presents a set of quality
requirements that are structured in an
AI-specific quality metamodel. It is important
to note that not all AI modules impose the
same quality requirements; the document
proposes differentiating AI modules with
regard to their safety, security, privacy,
and ethical relevance.

The document outlines and defines the
three central quality pillars: functionality &
performance, robustness, and
comprehensibility.

Field of Application This document applies to all life cycle
stages of an AI module — concept,
development, deployment, operation, and
retirement — and addresses a variety of
different life cycle processes.

2. Terms and definitions

For the purposes of this document, the following terms and definitions apply.

DIN and DKE maintain terminological databases for use in standardization at the following
addresses:

— DIN-TERMinologieportal: available at https://www.din.de/g

3. Quality Meta model

The key quality characteristics, the so-called quality pillars, that need to be taken into account
throughout the whole life cycle of an AI module, are functionality & performance, robustness and
comprehensibility. These three quality pillars are not fully disjoint. For instance, robustness may be
conceived as part of functionality & performance, since the adaptation to unknown environments
can be a functionality requirement in a given application. Furthermore, AI modules are divided into
two risk classes: AI modules with safety, security, privacy, or ethical relevance are
summarized as components with (potentially) high risk, and all other AI modules as
components with low risk. For high-risk AI modules, a deviation from the quality
requirements is either not permitted or must be justified, while for low-risk AI modules
this is less strict.

In this document, each AI module is considered to be either of high or low risk, or it is
assumed that a mapping of internal risk classes to high risk and low risk, respectively, is
carried out. For safety,
security, privacy, or ethically relevant AI modules this document requires the consideration of all
listed quality requirements. Potential deviations of such AI modules need a profound justification.

1. AI Module and Software System Relation

Software systems are composed of interacting system elements, where each has its own purpose and
requirements, respectively. The AI module is one of these elements that consist of AI methods and
algorithms, respectively. As an element of the software system, it relates to and interacts with other
elements such as hardware, software or data and with the surrounding environment such as humans.
Henceforth, this document focuses on the quality assurance of AI artifacts within the software
system. These artifacts can be hybrid systems. It is important to keep in mind that further
standards, requirements, and regulations can apply to the overall software system and
consequently to the AI module. In order to give a framework for DevOps of trustworthy AI
modules, a quality metamodel is proposed and described in this document.

2. Risk Evaluation

Risk grade Description

High-risk AI modules (so-called "critical" AI modules) with safety, security,
privacy, or ethical relevance. Domains with such relevance can be
autonomous driving, medical diagnostics, and credit ratings. Deviations
from recommended and highly recommended requirements are only
permitted in exceptional cases and with appropriate justification,
whereas deviations from mandatory requirements are not allowed.

Low-risk All other AI modules (so-called "comfort" AI modules). For low-risk AI
modules, deviations from recommended requirements are permitted
without further justification. A deviation from highly recommended
requirements is only permitted in exceptional cases and with appropriate
justification, whereas deviations from mandatory requirements, such as
the establishment of a risk identification and assessment process, are
not accepted.
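The two-class risk mapping and the associated deviation rules can be made explicit in a short sketch; the flag names and the return convention are illustrative assumptions, not part of the DIN SPEC text:

```python
# Sketch of the DIN SPEC 92001 two-class risk mapping: any safety, security,
# privacy, or ethical relevance places an AI module in the high-risk class.
# Flag names and the True/False/None convention are illustrative assumptions.

def risk_class(safety=False, security=False, privacy=False, ethics=False) -> str:
    return "high" if any((safety, security, privacy, ethics)) else "low"

def deviation_needs_justification(risk: str, level: str):
    """None = deviation not permitted; True/False = permitted with/without justification."""
    if level == "mandatory":
        return None                       # never permitted, regardless of risk class
    if risk == "high":
        return True                       # only in exceptional, justified cases
    return level == "highly recommended"  # low risk: recommended needs no justification

assert risk_class(safety=True) == "high"  # e.g., medical diagnostics
assert risk_class() == "low"              # "comfort" AI module
assert deviation_needs_justification("low", "recommended") is False
assert deviation_needs_justification("high", "recommended") is True
assert deviation_needs_justification("low", "mandatory") is None
```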

3. Environment, Platform, Data, Model

Model type Description

Model Space The model space includes all sets of potential


approaches to solve the problem task at hand.
Algorithms, mathematical models, architectures, and
parameter configurations that can lead to suitable
solutions for the prescribed task are included within
this set.

Inference Model The inference model is one specific element of the


model space. Thus, it is composed of a particular model
architecture with a fixed parameter configuration. This
configuration is derived from the model space via a
selection method, such as a training algorithm on some
data set. The inference model can be used to solve the
intended task to a certain degree.
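The model space versus inference model distinction can be illustrated in a few lines: the model space enumerates candidate architecture/parameter configurations, and a selection method fixes one of them as the inference model. The candidate names, parameters, and the scoring stand-in are illustrative assumptions:

```python
# Sketch of the model space vs. inference model distinction. Architectures,
# parameter values, and the scoring function are illustrative assumptions;
# in practice the score would come from training and validating on a data set.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Candidate:
    architecture: str
    learning_rate: float

# Model space: all considered architecture/parameter configurations.
model_space = [Candidate(a, lr)
               for a, lr in product(["linear", "tree"], [0.1, 0.01])]

def select(space, score):
    """Selection method: fix the candidate with the best score as inference model."""
    return max(space, key=score)

# Illustrative scoring stand-in for training + validation.
inference_model = select(model_space, lambda c: 0.9 if c.architecture == "tree" else 0.5)
assert inference_model.architecture == "tree"
```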
4. Life Cycle

1. General

Stage Definition Context of AI

Concept
Definition: Creation of all processes: defining the problem, analysing it, and finding a
suitable model space. Based on the specific problem, suitable models should be identified
and analyzed concerning properties like convergence and input assumptions. In this stage,
no model hyperparameters are chosen and no final model evaluation is done.
Context of AI: Additionally, acceptance criteria should be defined for further quality
assurance steps. It is, for instance, recommended to operationalize the problem such that
its formulation contains possible actions for a solution.

Development
Definition: A number of activities, including system design and specification, prototyping
and implementation, integration, bug tracking and bug fixing, verification and validation
including testing on various levels (functional, integration, performance & robustness),
packaging, documentation, versioning, etc.
Context of AI: Data-driven development approaches are used to construct an inference model
in connection with classical software engineering approaches. Such activities contain data
acquisition, data analysis, and the actual programming or training efforts. In the case of
ML models, the data set should be analyzed and understood, and variables that are relevant
for the goal or problem should be identified. In this stage, model hyperparameters are
compared concerning the quality of the specific model. Different measures and metrics for
the evaluation of the model quality can be considered. The aim is to find one model with
specific hyperparameters that adequately solves the problem. The representation of the data
set is possibly adapted to the chosen model, since some ML models need a specific input
shape.

Deployment
Definition: Transition from development to operation.
Context of AI: Two levels:
a) High degree of data-based learning: deployment includes the training of the model on
the host system and the export to the target system.
b) Low degree of data-based learning: the transition from host to target system is also
relevant. For instance, the acceptance of the AI module by the stakeholder is part of the
target system and has to be obtained.
Note that deployment starts the operation stage. Therefore, it is impossible to delineate
clearly between deployment and operation.

Operation
Definition: Maintenance and evaluation aspects in the environment where the AI module is
used.
Context of AI: ML algorithms can continue to learn from data through online learning and
thus continue to change after training in the experimental environment.

Retirement
Definition: Disintegration and discontinuation of the AI module as well as the transition
to a new AI module.
Context of AI: The AI module can be deleted from the software system or significantly
changed such that a new AI module is created. This starts a new life cycle and can thus be
interpreted as a retirement of the original AI module as well.

One important point about these stages is that everything is part of the development stage.


2. Life Cycle Processes

Processes are defined by title, purpose, and outcome.

a) Organizational project-enabling processes: This group is relevant from the concept stage
onwards and provides the assets needed to make the project work and to meet the expectations
of company stakeholders. Most processes within this group are only slightly affected by new
challenges introduced by AI.
Nevertheless, the user of this document needs to evaluate whether changes to existing processes are
required. For instance, ways in which these processes need to be refined include establishing quality
evaluation criteria that are applicable to functionality & performance, robustness, and
comprehensibility of AI modules.

b) Technical management processes: “are concerned with managing the resources and assets
allocated by organization management and with applying them to fulfill the agreements into which
the organization or organizations enter [...]. In particular they relate to planning in terms of cost,
timescales and achievements, to the checking of actions to help ensure that they comply with plans
and performance criteria and to the identification and selection of corrective actions [...]”.
Additionally, specific measures with respective quality criteria need to be defined that allow
evaluating if the AI module satisfies functionality & performance, robustness, and
comprehensibility criteria.

c) Technical processes: “transform the needs of stakeholders into a product or service by means of
technical actions throughout the life cycle". They ensure that sustainable performance and overall
quality are reached when the AI module is applied. This is the group of processes that is most
affected by AI-specific challenges. An important aspect that needs to be considered within the
system analysis process is, for instance, to ensure the needed extent of interpretability of the AI
module.

Agreement processes are a further process group, but they are not used in this document.

Agreement processes “are organizational processes that apply outside of the span of a project’s life,
as well as for a project’s lifespan. Agreements allow [...] to realize value and support business
strategies for […] organizations.” [3]. While agreement processes apply to the overall software
system, they bear no reference to one software component and AI-specific challenges. Thus, this
DIN SPEC does not include agreement processes.

3. AI Quality Pillars

AI quality characteristics in the form of requirements need to be considered.



The document introduces an approach to cover a sufficiently wide spectrum of AI-related software
quality aspects and to emphasize the importance of AI-specific requirements. It enables the
development and implementation of performant, robust, safe, and trustworthy AI modules.

Table 2. Three key qualities

Key Quality Definition AI Meaning

Functionality & performance
Definition: The degree to which an AI module is capable of fulfilling its intended task
under stated conditions.
AI Meaning: Performance evaluation and model selection are further topics that are
addressed in this quality pillar. It is required to precisely define the problem or goal
before development and analyze it with respect to constraints and assumptions concerning
environment, platform, data, and model. After problem analysis, potential solutions need
to be formalized and evaluated. To find suitable solutions, adequate performance measures
and metrics shall be chosen for the given task and data.

Robustness
Definition: The ability of an AI module to cope with erroneous, noisy, unknown, and
adversarial input data. Due to the complexity of the AI module's environment, which can
result from its non-stationarity and high dimensionality, robustness is a key AI quality
issue.
AI Meaning: The AI module's robustness needs to be adequately quantified and meet
requirements that are defined in the risk analysis. The dependence of the model on
environment, platform, and data has to be considered. Distributional shifts occur when the
AI module is exposed to data points outside the training or testing data set. The
possibility of an adversarial attack must be specifically addressed, since this poses a
major risk to the operation of AI modules in safety- and security-relevant settings. For
this, the adversary's knowledge of the AI module and the perturbation scope, respectively,
are to be assessed, and defense strategies are required to be chosen accordingly and
continuously monitored during development and deployment.
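As an illustration of the distributional-shift concern, the following sketch flags a batch whose mean drifts far from the training statistics. The three-sigma rule is an illustrative assumption; real deployments would use proper statistical tests on the full feature distribution:

```python
# Sketch of a simple distributional-shift check: flag an incoming batch whose
# mean drifts more than k standard deviations from the training statistics.
# The threshold k = 3 is an illustrative assumption.
from statistics import mean, stdev

def fit_stats(training_values):
    """Record mean and (sample) standard deviation of the training data."""
    return mean(training_values), stdev(training_values)

def is_shifted(batch, train_mean, train_std, k=3.0) -> bool:
    """True if the batch mean lies outside the k-sigma band of the training data."""
    return abs(mean(batch) - train_mean) > k * train_std

m, s = fit_stats([9.0, 10.0, 11.0, 10.0, 10.0])  # training mean 10.0
assert not is_shifted([10.2, 9.8, 10.1], m, s)   # in-distribution batch
assert is_shifted([25.0, 26.0, 24.0], m, s)      # shifted batch
```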

Comprehensibility
Definition: The degree to which a stakeholder with defined needs can understand the causes
of an AI module's output. The causes include the reason for a specific output, i.e. the
input leading to it, and the whole process of decision-making.
AI Meaning: This means that the AI component is transparent and explainable. Furthermore,
a qualitative understanding between the input variables and the response is provided with
respect to the stakeholder's level of expertise and need for comprehension. For instance,
the developer of an AI module needs to understand not only the data and inference model
but also the model space and the mathematical framework. This quality pillar focuses on
the transparency and interpretability of the chosen model: a model that can be clearly
explained to the stakeholder is a white-box, while models that cannot (grey-box or
black-box) can create difficulties for the project.

Conclusion of quality assurance

Quality assurance rests on three parts: the life cycle, the influencing factors, and the
three quality pillars. The project manager needs to bring together the influencing factors
environment, platform, data, and model. This raises awareness of possible quality issues
that can arise during the different life cycle stages and processes of the AI module. The
points to consider along the life cycle are guided by the three quality pillars. All
requirements for quality assurance are collected in these quality characteristics. Thus,
the AI quality meta model covers all aspects of AI quality assurance.

Bibliography

[3] ISO/IEC/IEEE 12207:2017, Systems and software engineering – Software life cycle
processes. Tech. rep. ISO, IEC and IEEE, 2017.
Annex G:
AI4H Project Deliverable Reference

Table G: AI4H Project Deliverable Reference ID

AI4H Project Deliverable ITU Document Reference ID

AI Software Life Cycle Specification FG-AI4H-G-204

AI4H regulatory [best practices | considerations] FG-AI4H-G-202

Mapping of IMDRF Essential Principles to AI for Health Software FG-AI4H-G-038-A01

Data Annotation Specification FG-AI4H-G-205-A03

AI4H Training Best Practices Specification FG-AI4H-G-206

AI4H Evaluation Specification FG-AI4H-G-207

AI Technical Test Specification FG-AI4H-G-207-A02

AI4H Ethics Considerations FG-AI4H-G-201

AI4H Applications and Platform FG-AI4H-G-200

AI4H Scale-up and Adoption FG-AI4H-G-

____________________________
