
Explainable AI for Bioinformatics: Methods, Tools, and Applications

Md. Rezaul Karim1,2†, Tanhim Islam1, Oya Beyan3,1, Christoph Lange1,2, Michael Cochez4,5, Dietrich Rebholz-Schuhmann6,7 and Stefan Decker1,2

1 Department of Data Science and Artificial Intelligence, Fraunhofer FIT, Germany
2 Chair of Computer Science 5 - Information Systems and Databases, RWTH Aachen University, Germany
3 University of Cologne, Faculty of Medicine and University Hospital Cologne, Institute for Biomedical Informatics
4 Department of Computer Science, Vrije Universiteit Amsterdam, the Netherlands
5 Elsevier Discovery Lab, Amsterdam, the Netherlands
6 ZB MED - Information Center for Life Sciences, Germany
7 Faculty of Medicine, University of Cologne, Germany

Corresponding author: rezaul.karim@rwth-aachen.de. This paper is currently under review in Briefings in Bioinformatics.

Abstract
Artificial intelligence (AI) systems based on deep neural networks (DNNs) and machine learning (ML) algorithms are increasingly used
to solve critical problems in bioinformatics, biomedical informatics, and precision medicine. However, complex DNN/ML models are often
unavoidably opaque and perceived as black-box methods, and they may not be able to explain why and how they make decisions. Such black-box
models are difficult to comprehend not only for targeted users and decision-makers but also for AI developers. Besides, in sensitive areas
like healthcare, explainability and accountability are not only desirable properties of AI but also legal requirements – especially when AI
may have significant impacts on human lives. Fairness is another rising concern in these contexts: algorithmic decisions should not favour
or discriminate against certain groups or individuals based on sensitive attributes. Explainable artificial intelligence (XAI) is an emerging
field that aims to mitigate the opaqueness of black-box models and make it possible to interpret how AI systems make their decisions. An
interpretable ML model can explain how it makes predictions and which factors affect the model's outcomes. The majority of state-of-the-art
interpretable ML methods have been developed in a domain-agnostic way and originate from computer vision, automated reasoning, or even
statistics. Many of these methods cannot be directly applied to bioinformatics problems without prior customization, extension, and domain
adaptation. In this paper, we discuss the importance of explainability with a focus on bioinformatics. We analyse and provide a comprehensive
overview of model-specific and model-agnostic interpretable ML methods and tools. Via several case studies covering bioimaging, cancer
genomics, and biomedical text mining, we show how bioinformatics research could benefit from XAI methods and how they could help improve
decision fairness. We believe this review will provide valuable insights and serve as a starting point for researchers wanting to improve
explainability and transparency while solving emerging research and practical problems in bioinformatics.

Keywords: Machine learning, Deep learning, NLP, Interpretable machine learning, Explainable AI, Bioinformatics, Healthcare.

Introduction

Artificial intelligence (AI) systems that are built on machine learning (ML) and deep neural networks (DNNs) are increasingly deployed in numerous application domains such as military, cybersecurity, healthcare, etc. Further, ML and DNN models are applied to solving complex and emerging biomedical research problems, covering text mining, drug discovery, single-cell RNA sequencing, and early disease diagnosis and prognosis. Disciplines such as systems medicine apply holistic approaches to understand complex physiological and pathological processes, thereby forming a basis for the development of innovative methods for diagnosis, therapy, and prevention strategies mitigating disease conditions. Clinical data can be combined and fully analyzed by employing ML methods for new biomarker discovery. Further, the paradigm of evidence-based precision medicine has evolved toward a more comprehensive analysis of disease phenotypes and their induction through underlying molecular mechanisms and pathway regulation. Another common application of AI in precision medicine is predicting which treatment protocols are likely to succeed on a patient based on patient phenotype, demographics, and treatment contexts [19].

Biomedical data science includes various types of data such as sequencing data, multi-omics, imaging, EMRs/clinical outcomes, and structured and unstructured biomedical text data [33], where ML methods are typically used for the analysis and interpretation of multimodal data (e.g., multi-omics, imaging, clinical outcomes, medication, disease progression, etc.). Further, many datasets, including bioimaging and omics, are of increasing dimensionality. The surge of massive amounts of biomedical data not only brings unprecedented progress in bioinformatics and opportunities to perform predictive modelling at scale [33], it also introduces challenges to existing AI methods and tools such as data heterogeneity, high-dimensionality, and volume [50]. Principal component analysis (PCA) and isometric feature mapping (Isomap) are extensively used as dimensionality reduction techniques [26]. However, the representations learned

by these methods often lose essential properties [2], making them less effective against a known phenomenon called the curse of dimensionality, particularly in high-dimensional datasets [26]. A complex DNN model may tackle this issue very well and benefit from higher pattern recognition capabilities when it comes to feature extraction and learning useful representations from high-dimensional data. A concrete example is using autoencoders (AEs) for unsupervised learning tasks. AE architectures with multiple hidden layers and non-linear activation functions learn to encode complex and higher-order feature interactions. Learning non-linear mappings allows transforming the input feature space into a lower-dimensional latent space. The resultant embeddings, which can capture important contextual information of the underlying data, can be used for different downstream tasks [50]. However, latent factors learned by AEs are not easily interpretable, whilst disentangling them can provide insight into what features are captured by the representations and what attributes of the samples the task is based on [50].
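As a minimal illustration of this idea (not taken from the paper's experiments), the following PyTorch sketch compresses a high-dimensional omics-like feature vector into a low-dimensional latent embedding; the layer sizes and the random input are placeholders.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Minimal non-linear autoencoder: the encoder compresses the input
    into a latent code, the decoder reconstructs the original features."""
    def __init__(self, n_features: int, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 256), nn.ReLU(),
            nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, n_features))

    def forward(self, x):
        z = self.encoder(x)           # low-dimensional embedding
        return self.decoder(z), z     # reconstruction and latent code

# toy usage with random "gene expression" profiles (placeholder data)
x = torch.randn(64, 5000)             # 64 samples, 5000 features
model = Autoencoder(n_features=5000)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(5):                     # a few unsupervised training steps
    recon, z = model(x)
    loss = nn.functional.mse_loss(recon, x)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
# z can now be used as a compact representation for downstream tasks
```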
AI systems are already deployed in selected application areas in healthcare and are already outperforming medical experts at selected tasks such as spotting malignant tumours, modelling disease progression, survival analysis, and treating cancerous conditions. However, the adoption of AI-driven approaches in many clinical settings has been hampered by the lack of efficient AI solutions that are able to deal with large-scale, high-dimensional biomedical data that carry high levels of uncertainty. Although DNN models can solve very complex problems, with increasing complexity owing to non-linearity and higher-order interactions among a large number of features, they tend to be less and less interpretable and may end up as black-box methods. Their internal working principles are unavoidably opaque, and their algorithmic complexity is simply beyond humans to comprehend. Mathematical certainty means it should be possible to transparently show that no hidden behaviour or logic influences the behaviour of a model [77].

The predictions made by such a black-box can neither be traced back to the input, nor is it clear why outputs are transformed in a certain way. If an AI system is built on such complex models and employed in biomedical data science, it will not only lack explainability but also have trustworthiness and transparency issues. Further, these limitations will evolve into another serious drawback: the inability to answer follow-up questions from human users and to provide contrastive or causal explanations. The emerging explainable artificial intelligence (XAI) field aims to mitigate the opaqueness of black-box models by interpreting how AI systems make their decisions. The primary objectives of XAI are improving human understandability, decision reasoning capability, transparency, and accountability of AI systems [33]. An interpretable ML model can identify factors (e.g., statistically significant features) that can be used to explain the way they affect the model's outcome and the way they interact. Although linear models, decision trees (DTs), and rule-based systems are less complex and intrinsically interpretable, they are less accurate compared to tree-based ensembles and DNN models that have been shown to outperform simple models. This phenomenon, called the accuracy vs. interpretability trade-off, is shown in fig. 1.

Figure 1 Accuracy vs. interpretability trade-off - complex models tend to be less and less interpretable [52]

Recently, XAI has gained a lot of attention from both academia and industry [97]. Numerous model-specific and model-agnostic interpretable ML methods have been proposed, with a view to improving local and global interpretability of ML models [100]. However, the majority of these methods have been developed in a domain-agnostic way [33] and may therefore not be inherently suitable for all biomedical and bioinformatics problems and diverse data types. Therefore, these methods need to be customized or extended in order to fit individual needs in bioinformatics and a variety of data types [33].

In this paper, we discuss the importance and advantages of explainability with a focus on bioinformatics. First, we provide a comprehensive overview of different interpretable ML methods. Then, via several case studies covering bioimaging, cancer genomics, and biomedical text mining, we show how bioinformatics research could benefit from XAI methods and how they could help improve decision fairness. Finally, we provide a brief overview of interpretable ML tools and libraries.

Importance of XAI in Bioinformatics

Large-scale biomedical data is not only heterogeneous, highly dimensional, and unstructured, but also noisy and subject to high levels of uncertainty. Beyond this data-driven nature and technical complexity, the adoption of data-driven approaches in many bioinformatics application scenarios could be hampered by the lack of efficient ML models that are able to tackle these challenges. Yet, an AI system is expected to perform in a reliable and safe manner and to provide trustworthy decisions. Although not all predictions need to be explained, an ML model itself needs to be interpretable, since higher interpretability of the model means easier comprehension for the targeted users [94]. Therefore, the need for efficient ML models with explainability is getting stronger, so that transparency, fairness, and accountability of AI can be enforced for mission-critical situations [18]. As outlined in fig. 6, interpretable ML methods provide several advantages over their black-box counterparts, irrespective of the example introduced in the next section.

Helps avoid practical consequences

A critical application of AI is aiding the treatment of cancerous conditions, which have different types and subtypes. An early diagnosis of cancer, by classifying patients into high- or low-risk groups, is key to facilitating subsequent clinical management of patients [59]. Let's introduce a more concrete example of a clinical diagnosis of cancer for a breast cancer patient: a doctor suspects a patient could have breast cancer. Since breast cancer is one of the leading causes of death in women, the patient case requires careful investigation. By acquiring deep insights into omics data (e.g., somatic mutations, gene expression (GE), DNA methylation, copy number variations (CNVs), and miRNA expressions), accurate diagnosis, as well as treatment, can be focused on preventive measures. Let's assume a multimodal DNN model trained on multi-omics data can distinguish (classify) cancerous samples from healthy samples with an accuracy of 95%. Assume a patient is diagnosed by the trained model with breast cancer with a probability of 70%, and asks several follow-up questions: i) "why do I have breast cancer?", ii) "how did the model reach this decision?", iii) "which biomarkers or factors are responsible for it?". However, since the representations learned by multimodal DNN architectures may not be easy to interpret, it may not be possible to transparently show what features the representations captured and what attributes of the samples the classification would be based on.

Figure 2 Example of practical consequence: a black-box model cannot explain diagnosis decisions [52]

The diagnosis may further depend on several distinct molecular subtypes and factors like estrogen, progesterone, and human epidermal growth factor receptors. Therefore, the AI system may require using multimodal data (e.g., multi-omics,

bioimaging, and clinical data) and creating a multimodal DNN architecture. Further, image-guided pathology may need to be employed in order to analyze imaging data (e.g., breast histopathological image analysis), as shown in fig. 3. The inability to: i) explain the decision in a multimodal data setting, and ii) answer such follow-up questions means the model is technically a black-box model, as shown in fig. 2. Any AI system built on black-box models will be a black-box too, where the internal logic is either intentionally or strategically hidden from users [18]. A black-box model would not allow tracing how and why an input instance is mapped to a decision and vice versa. Opacity is one of the major issues in black-box ML – a problem involving theoretical, practical, and legal consequences [107]. Therefore, such a black-box model cannot be used in a clinical setting.

Figure 3 AI for cancer diagnosis in a clinical setting [52]

An interpretable ML model that emphasizes transparency and traceability of its underlying logic can reason why and how it reached certain decisions, thereby avoiding practical consequences. For our cancer diagnosis example, local interpretability explains why a specific decision was made for an instance, or does so by providing examples of instances similar to the target instance [94]. Thus, we can emphasize specific characteristics of a patient representing a smaller group, yet different from other patients. In contrast, global interpretability outlines the overall behaviour of the model at an abstract level. Another concrete example in the same context could be as follows: suppose an ML model is trained to predict whether a gene is up-regulated (i.e., label) after some treatment based on the presence or absence of a set of regulatory sequences (i.e., features). Global interpretability will tell us how important a regulatory sequence is for predicting up-regulation across all genes in the dataset, while a local interpretation strategy will tell us how important the sequence is for predicting a particular gene as up-regulated.

Reduces complexity and improves accuracy

The goal of genomics data analysis is to disseminate biologically meaningful information about genes and proteins, since it provides a clearer understanding of the roles a certain set of genes plays in cancer development. However, biological processes in humans are not single-gene-based mechanisms, but complex systems controlled by regulatory interactions between thousands of genes [33]. Omics data are very high-dimensional, where a large number of genes may be irrelevant for a learning task, e.g., cancer type prediction. For example, with more than 30,000 genes in GE data, the size of the feature space is so huge that the training data will be sparse. Besides, sample sizes are often composed of only a few tens to hundreds of cases, e.g., in clinical trials. The inclusion of all features not only introduces unwanted noise impeding the predictive power of a model but also increases the computational complexity.

Therefore, selecting biologically significant features with high individual correlation w.r.t the classes of interest is important, such that no correlation exists among the genes. Accurate identification of such cancer-specific biomarkers: i) helps improve classification accuracy, ii) allows biologists to study the interaction of relevant genes, and iii) helps study their functional behaviour to facilitate further exploration and discovery of other genes. Once such biomarkers are identified based on individual feature attributions, they can be ranked

according to their relative or absolute importance. Further, the identified genes can be treated as cancer-specific marker genes that characterize specific or multiple tumour classes.

Improves decision fairness

With the wide adoption of AI, it is also important to take fairness issues into consideration, since AI systems can be used in many sensitive environments to make important and life-changing decisions [94]. Biases are one of the main obstacles to fair decision-making. Therefore, fighting against bias and discrimination has a long history in philosophy and psychology [71]. Statistically, bias means an inaccurate representation of what is true w.r.t the underlying population. Biases can manifest in any part of an ML pipeline [71, 103]: from data collection, feature selection, model training, and setting hyperparameters, to the way results are interpreted to affected users, e.g., patients. For example, in healthcare and biomedical domains, representative, selection, and discriminatory biases can easily be ingested into biophysical data, which will raise fairness issues and can lead to unfairness in different learning tasks [71].

Figure 5 Different types of biases involved in a typical ML pipeline (recreated based on [76])

ML algorithms are always a form of statistical discrimination that becomes objectionable when it places certain privileged groups at a systematic advantage and certain unprivileged groups at a systematic disadvantage. A model trained on biased data will be biased. If an AI system is developed based on such ML models, it will produce biased decisions too. A recent study [81] has shown that a widely used algorithm affecting millions of patients exhibits significant racial bias. At a given risk score, the authors showed that black patients are considerably sicker than white patients, pointing out that remedying this disparity would increase the percentage of black patients receiving additional help from 17.7 to 46.5%. Scenarios like these can further undermine the trust of healthcare experts and other affected bodies. As for our cancer example, predictions based on a biased model can disproportionately impact the diagnosis, which can lead a doctor to choose the wrong treatment too.

Since humans are naturally biased, data collected or prepared by humans will always carry at least minimal biases. Therefore, being aware of common human biases that could manifest in the data is important, so that proactive measures can be taken to mitigate their effects before training a model. Fair decision-making requires extensive contextual understanding [71], where interpretability and explainability can help identify sensitive attributes and variables that could drive unfair outcomes. Once identified, proactive measures can be taken to ensure that the decisions would not reflect discriminatory behaviour toward certain groups or populations due to those variables [72].

Figure 4 An interpretable model can explain decisions locally, outlining global behaviour (conceptually recreated based on [94])

Figure 6 Advantages of XAI in improving interpretability, trust, transparency, and fairness in algorithmic decision-making

Internal governance and legal compliance

With increasing adoption of AI across domains, the need to explain the decisions made is important for ethical, judicial, and safety reasons [18]. For example, in a sensitive area like healthcare, explainability and accountability are not only desirable properties but also legal requirements – especially where AI may have an impact on human lives. The European Union General Data Protection Regulation (EU GDPR) [45]

advocates important realization of AI principles such as ethics, accountability, and robustness (e.g., adversarial attacks could fool an ML model during decision-making). Further, it enforces that processing based on automated decision-making tools should be subject to suitable safeguards, including the "right to obtain an explanation of the decision reached after such assessment and to challenge the decision", and that individuals "have the right not to be subject to a decision based solely on automated processing" whenever human subjects have their lives significantly impacted by an automatic decision-making machine.

Since AI plays a central role in the interaction between organizations and individuals [73], explaining AI-assisted decisions to affected individuals requires those within an organization to understand the overall decision-making processes [53]. GDPR prohibits the use of AI for automated decision-making unless clear explanations of the logic involved are provided in a human-interpretable way. To avoid any legal consequences, how the ML model makes predictions should be as transparent as possible in an interpretable manner. As an iterative model design can generate better results, interpretability can facilitate better learning in such scenarios [18]. By enforcing explainability as a key requirement: i) an organization will have better oversight of what an AI system does and why, ii) it ensures the AI system continues to meet objectives and supports refining them to increase precision, iii) it helps an organization to comply with parts of GDPR, and iv) to adhere to external policies, practical consequences, and processes that regulate business practices [53].

Techniques and Methods for Interpretable ML

A wide range of model-specific and model-agnostic interpretable ML methods have been proposed and developed [9]. These methods largely fall into 3 main categories: probing, perturbing, and model surrogation. Further, depending on the level of abstraction, they can be categorized as local interpretability and global interpretability methods1. We categorize these interpretable ML methods based on papers, books, and tools.

Terminologies and notations

Interpretability and explainability are used interchangeably in the literature. However, the former signifies that the cause and effect of a certain outcome can be determined in a system, while the latter is the extent to which the internal working mechanism of an AI system can be explained. According to the Cambridge Dictionary2, "if something is interpretable, it is possible to find its meaning or possible to find particular meaning in it". Miller et al. [74] define explanation as the answer to 'why' questions. Das et al. [18] define interpretation as a "simplified representation of a complex domain, such as outputs generated by an ML model, to meaningful concepts which are human-understandable and reasonable". Interpretability, on the other hand, "is the degree to which a human can understand the cause of a decision" [74]. Interpretability gives an ML system the ability to explain or present a model's predictions to humans in an understandable way [24]. Less technically, the interpretability of an ML model is the extent to which cause and effect can be observed.

If the parameters of an ML model and its architecture are known, the model is considered a white-box. An ML model is a black-box if its parameters and network architecture are hidden from the end user. In contrast, interpretable ML refers to methods that make the behaviour and predictions of a system understandable to humans [77]. Algorithmic transparency suggests that the factors that influence the decisions made by algorithms should be visible, or transparent, to the people who use, regulate, and are affected by systems that employ those algorithms [23]. An ML model is transparent if it is expressive enough to be human-understandable, where transparency can be a part of the algorithm itself or achieved using external means (e.g., model decomposition or simulations) [18].

Although a white-box model improves model debugging and promotes trust, knowing its architecture and parameters alone won't make the model explainable [18]. Thereby, an interpretable model can outline how input instances are mapped to certain outputs by identifying statistically significant features, while explainability is using the knowledge of what those features represent and their relative importance to explain the predictions to humans in understandable terms. We refer to some terminologies and notations from our earlier work [47] that are used in the remainder of the paper.

Definition 1 (Dataset) D = (X̃, Ỹ) is a dataset, where X̃ is an N-tuple of M-instances, X is the set of all instances in X̃, and Ỹ is an N-tuple of labels l ∈ L.

Definition 2 (Model and prediction) Let the pair (name, value) be a parameter and Θ be a set of parameters. A model f is a parameterized function f : X × Θ → R that maps an input instance x from a feature space X to a decision y ∈ L and returns an output called prediction ŷ. A prediction ŷ = f(xi, θ) is accurate for model f and parameter θ ∈ Θ if and only if ŷ ≈ Ỹ[i] for x = X̃[i], where 1 ≤ i ≤ M.

Definition 3 (Black-box and interpretable models) Let f be a model and Θ be a set of parameters. Model f is a black-box if its internal working principle and θ are hidden or uninterpretable by humans owing to a lack of traceability of how f makes predictions. Model f is interpretable if the parameters θ are known and there exists a mathematical interpretation λi showing how and why a certain prediction ŷ is generated by f.

Definition 4 (Algorithmic transparency) A model f is transparent if there exists a mathematical interpretation λt by which the learning algorithm learns model f from D by mapping relations between X and Y to make predictions.

Definition 5 (Feature importance) Let ai be a feature in instance x and A be the set of all features. The importance function h : A → [0, 1] assigns each element of A a non-negative number between 0 and 1: the larger the number, the higher the importance of ai. Local feature importance for x is a set of feature and importance pairs Ix = {(a1, h(a1)), (a2, h(a2)), . . . , (aM, h(aM))} for all ai ∈ x. Global feature importance for X is a set of feature and importance pairs ĪX = {(a1, p̄1), (a2, p̄2), . . . , (aM, p̄M)}, where p̄i is the mean local feature importance of ai.

Definition 6 (Feature impacts) Let ai be a feature of instance x, and A be the set of all feature names. The impact function g : A → [−1, 1] takes an element of A as input and returns a real number between -1 and 1. Local feature impact for x is a set of feature and impact pairs Tx = {(a1, g(a1)), (a2, g(a2)), . . . , (aM, g(aM))} for all ai ∈ x. Global feature impact for X is a set of feature and impact pairs T̄X = {(a1, q̄1), (a2, q̄2), . . . , (aM, q̄M)}, where q̄i is the mean of all local feature impacts of feature ai.

1 GitHub repo: https://github.com/rezacsedu/XAI-for-bioinformatics
2 https://dictionary.cambridge.org/dictionary/english/interpretable

Table 1 Interpretable ML techniques, categorized based on agnosticism, scope, underlying logic, and data types
Approach Scope Agnosticism Methodology Datatype
CAM [35, 43] Local Model agnostic Back-propagation Image
Grad-CAM [88] Local Model agnostic Back-propagation Image
Grad-CAM++ [17] Local Model agnostic Back-propagation Image
Guided Grad-CAM [96] Local Model agnostic Back-propagation Image
Respond CAM [109] Local Model agnostic Back-propagation Image
LRP [10, 11] Local Model agnostic Back-propagation Image
DeepLIFT [90] Global/local Model agnostic Back-propagation Image
Prediction difference analysis (PDA) [113] Local Model agnostic Perturbation Image
Slot activation vectors [44] Global Model agnostic Back-propagation Text
Peak response mapping (PRM) [111] Local Model agnostic Back-propagation Image
LIME [84, 85] Global/local Model agnostic Perturbation Image, text, tabular
MUSE [61] Global Model agnostic Perturbation Text
DeConvolutional nets [108] Local Model agnostic Back-propagation Image
Guided back-propagation [43, 93] Local Model agnostic Back-propagation Image
Activation maximization [25] Local Model agnostic Back-propagation Image
Bayesian averaging over decision trees [87] Global Model specific Bayesian Tabular
Generative discriminative models (GDM) [16] Global Model specific Discriminative Tabular
Bayes rule lists [65] Global Model specific Rule-based Tabular
Shapley sampling [70] Global/local Model agnostic Perturbation Image, text, tabular
Gradient-based saliency maps [91] Local Model agnostic Back-propagation Image
Bayesian case model (BCM) [54] Global Model specific Bayesian Image, text, tabular
Deep Taylor decomposition [78] Local Model agnostic Decomposition Image
Deep attribution maps [6] Local Model agnostic Back-propagation Image, text
Axiomatic attributions [95] Local Model agnostic Back-propagation Image, text
PatternNet and pattern attribution [58] Local Model agnostic Back-propagation Image
Neural additive models (NAMs) [1] Global Model specific Others Image
ProtoAttend [7] Global Model agnostic Others Image
Concept activation vectors (CAVs) [55] Global Model agnostic Others Image
Global attribution mapping (GAM) [41] Global Model agnostic Perturbation Image
Randomized sampling for explanation (RISE) [82] Local Model agnostic Perturbation Image
Spectral relevance analysis (SpRAy) [62] Global Model agnostic Back-propagation Image
Salient relevance (SR) map [66, 101] Global/local Model agnostic Back-propagation Image
Automatic concept-based explanations (ACE) [28] Global Model agnostic Others Image
Causal concept effect(CaCE) [29] Global Model agnostic Others Image

Definition 7 (Top-k and bottom-k features) Let f be a model, I be global feature importance for D, and k be an integer. The top-k features is a k-tuple such that for all i ≤ k ≤ m, I[k[i]] ≥ I[k[m + 1]] and I[k[i]] ≥ I[k[i + 1]], where k is the number of top features used to explain the decision. A bottom-k features is a k-tuple such that for all i ≥ k ≥ m, I[k[i]] ≤ I[k[m + 1]] and I[k[i]] ≤ I[k[i + 1]]3.

Definition 8 (Top-k and bottom-k impactful features) Let f be a model, T be global feature impacts for D, and k be an integer. The top-k impactful features is a k-tuple such that for all i ≤ k ≤ m, T[k[i]] ≥ T[k[m + 1]] and T[k[i]] ≥ T[k[i + 1]], where k is the number of top features used to explain a decision. A bottom-k impactful features is a k-tuple such that for all i ≥ k ≥ m, T[k[i]] ≤ T[k[m + 1]] and T[k[i]] ≤ T[k[i + 1]]4.

3 Since the learning algorithm for a model involves stochasticity and the way scores are computed can differ, feature importance scores (p) could be different if I is sorted in ascending or descending order of p.
4 Since the learning algorithm for a model involves stochasticity and the way scores are computed can differ, feature impact scores (q) could be different if T is sorted in ascending or descending order of q.

Definition 9 (Interpretability) Interpretability is the degree to which humans can understand the cause of a decision [74]. Global interpretability refers to an explanation E explaining why model f has predicted Ŷ for all instances in X, outlining overall conditional interactions between dependent and independent variables using some functions σ(·, ·). Local interpretability e reasons why ŷ has been predicted for an instance x, showing conditional interactions between dependent variables and ŷ, focusing on individual predictions.

Definition 10 (Algorithmic fairness) An algorithm is fair if its predictions do not favour or discriminate against certain individuals or groups based on sensitive attributes.

Definition 11 (Explanations) An explanation e for a prediction ŷ = f(x) for an instance x ∈ X is an object derived from model f using some function σ(·, ·) that reasons over f and x for y such that e = σ(f, x) and e ∈ E, where E is the human-interpretable domain and σ is an explanation function.

Definition 12 (Decision rules) A decision rule r is a formula p1 ∧ p2 ∧ · · · ∧ pk → y, where pi are boolean conditions on feature values in an instance x and ŷ is the decision. Each decision rule r is evaluated w.r.t x by replacing each feature with a feature value from x, evaluating the boolean condition, and reaching a conclusion y if the condition is evaluated to be true.

Definition 13 (W-perturbations) Let D = (X̃, Ỹ) be a dataset, let X be the set of all instances in X̃, and let x be

an instance of an m-tuple in X. Let x′ be the vector resulting from applying a minimum change ∆x to some feature values vi of x using an optimization method (e.g., ADAM [100]) such that y′ = f(x′) differs from the original prediction y = f(x), where x′ = x + ∆x. Then, x′ is called a W-perturbation of x and y′.5

Definition 14 (Counterfactual rules) Let r : p → y be a decision rule for an instance x and x′ be the perturbed vector of x obtained by applying a W-perturbation. A counterfactual rule rl for the boolean conditions p is a rule of the form rl : p[δ] → y′ for y′ = f(x′) s.t. y′ ≠ y, where δ is a counterfactual for the original decision y = f(x) of model f.

Definition 15 (Surrogate model) Let x ∈ X be an instance of an m-tuple, Ỹ be an N-tuple of labels l ∈ L, and fb be a black-box model. Model f is a surrogate model of fb if the variance R² of fb captured by f is ≈ 1.

Definition 16 (Causally interpretable model) Let f be an interpretable model and q be a causal or counterfactual question. Then, the model f is causally interpretable if there exists at least one answer ŷa to question q.

Definition 17 (Local explanations) Let f be an interpretable model, ŷ = f(x) be the prediction for instance x, r be a decision rule for ŷ, Φ be the set of counterfactual rules for r, and I be the set of local feature importance for ŷ. A local explanation e explaining decision ŷ for x is a triple (I, r, Φ). The domain Eℓ of e is an N-tuple of triples (e1, e2, · · · , eN), where N is the number of instances in X.

Definition 18 (Global explanations) Let f be a model, ŷ = f(x) be the prediction for instance x, Ī and T̄ be the sets of global feature importances and impacts for the set of predictions Ŷ for instances X, Īkt = {(a1, q̄1t), (a2, q̄2t), . . . , (ak, q̄kt)} be the set of top-k features, and Īkb = {(a1, q̄1b), (a2, q̄2b), . . . , (ak, q̄kb)} be the set of bottom-k features. A global explanation eg is a pair ⟨Īkt, Īkb⟩. The domain Eg of eg is a pair (Ī, T̄).

5 Making changes to certain features may lead to a different outcome.

The methods we discuss in the following sub-sections largely operate by approximating the outputs of a black-box model via tractable logic or via local approximation with a linear model, thereby putting emphasis on transparency and traceability of the black-box [74]. Besides, tree-, rule-, and knowledge-based interpretable methods have been proposed.

Figure 7 Overview of probing, perturbing, and model surrogation interpretability methods: (a) probing a black-box model; (b) interpreting a black-box model with perturbing; (c) interpreting a black-box model with model surrogation (based on Azodi et al. [9])

Figure 8 Timeline and evolution of interpretable ML methods, covering scopes, methodology, and usage level [52]

Probing black-box models

Some interpretable methods do not take into account the inner working principles of black-box models. Therefore, probing techniques have been proposed to probe and understand the underlying logic of opaque models. Saliency and gradient-based methods such as gradient-weighted class activation mapping (Grad-CAM++) [17] and layer-wise relevance propagation (LRP) [10] are examples of this category, where first-order gradient information of a black-box model is used to produce heatmaps (HMs) indicating the relative importance of input features, e.g., relevant regions can be localised with HMs w.r.t the importance of each feature (e.g., pixels in a CT/MRI/X-ray image).

These approaches are particularly suitable for bioimaging, where a convolutional neural network (CNN) can learn features like class-discriminating pixels using edge detectors and filters across convolutional layers. Thereby, it can learn more abstract features in the deepest layer and provide attention to some areas using attention maps, explaining where the model puts more attention and where the most important pixels are located.

Saliency map and gradient-based methods Saliency maps (SMs) and gradient-based methods are used to identify relevant regions, followed by assigning importance to each feature, e.g., a pixel in an image. Guided backpropagation (GB) [93], class activation maps (CAM) [110], Grad-CAM/Grad-CAM++ [17, 88], and LRP are examples of this category. In GB, absolute output gradients are shown as heat maps (HMs), and input nodes with negative gradients are set to zero at the rectified linear layers of the network; "rectifying" the gradients in the backward pass in this way leads to more focused HMs [14].

A class-discriminating attention map is applied to visualize the weighted combination of the feature maps (FMs). To visualize where a CNN architecture provides more attention, CAM computes weights w.r.t each FM from the final convolutional layer. The contribution to prediction y^c is calculated at location (i, j). To obtain L^c_{ij} that satisfies y^c = \sum_{i,j} L^c_{ij}, the last FM A_{ijk} and prediction y^c are represented in a linear relationship. The linear layers consist of global average pooling (GAP) and a fully connected layer (FCL): GAP outputs F_k = \sum_{i,j} A_{ijk} and the FCL holds weights w_k^c to generate the final prediction as [56]:

y^c = \sum_k w_k^c F_k = \sum_k w_k^c \sum_{i,j} A_{ijk} = \sum_{i,j} \sum_k w_k^c A_{ijk},   (1)

where L^c_{ij} = \sum_k w_k^c A_{ijk} [56]. However, if linear layers replace the classifier architecture, re-training of the network is required and the non-linearity of the classifier vanishes. An improved generalization of CAM called Grad-CAM is proposed [89], where, instead of pooling, globally averaged gradients of the FMs are used as weights for the target class c.

Guided backpropagation in Grad-CAM helps generate more human-interpretable but less class-sensitive visualizations than SMs. Since SMs use true gradients, network weights are likely to impose a stronger bias toward specific subsets of input pixels. Accordingly, class-relevant pixels are highlighted instead of producing random noise [80]. Therefore, Grad-CAM uses HMs to provide attention on and to localize discriminating image regions, where the class-specific weights of each FM are collected from the final convolutional layer through globally averaged gradients (GAG) instead of pooling [17]:

\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A_{ij}^k},   (2)

where Z is the number of pixels in a FM, c is the class for which the gradient is taken, and A_{ij}^k is the value of the k-th FM [17]. Having gathered the relative weights, the coarse SM L^c is computed as the weighted sum of \alpha_k^c \cdot A_{ij}^k, with a rectified linear unit (ReLU) activation applied to the linear combination of FMs [17]. Since only features with a positive influence on the class are of interest, negative pixels that belong to other categories are discarded [89]:

L^c = \mathrm{ReLU}\left( \sum_k \alpha_k^c A^k \right).   (3)
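To make Eqs. (2)-(3) concrete, the following PyTorch sketch computes a Grad-CAM heatmap with forward/backward hooks on the last convolutional block of a torchvision ResNet. The model, target layer, and random input are placeholders (not the setup of the paper's case studies), and the torchvision API shown assumes version ≥ 0.13.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()   # placeholder CNN; any trained model works
target_layer = model.layer4[-1]          # last convolutional block

activations, gradients = {}, {}
def fwd_hook(module, inp, out):
    activations["value"] = out.detach()          # feature maps A^k
def bwd_hook(module, grad_in, grad_out):
    gradients["value"] = grad_out[0].detach()    # dy^c / dA^k

target_layer.register_forward_hook(fwd_hook)
target_layer.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)                  # placeholder image
scores = model(x)
target_class = scores.argmax(dim=1).item()
scores[0, target_class].backward()

# Eq. (2): channel weights = globally averaged gradients of the feature maps
weights = gradients["value"].mean(dim=(2, 3), keepdim=True)        # [1, C, 1, 1]
# Eq. (3): ReLU over the weighted sum of feature maps
cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)            # normalise to [0, 1]
# 'cam' is a heatmap over the input highlighting class-discriminating regions
```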

If an image contains multiple occurrences of the same class with slightly different orientations, several objects fade away from the SMs produced by Grad-CAM. Due to its overlooking of the significance disparity among pixels, parts of the objects are rarely localised by Grad-CAM. To overcome this, Grad-CAM++ is proposed [17], where the GAG is replaced with a weighted average of the pixel-wise gradients, since the weights of the pixels contribute to the final prediction, by applying the following iterators over the same activation map A^k, (i, j) and (a, b) [89]:

w_k^c = \sum_i \sum_j \alpha_{ij}^{kc} \cdot \mathrm{ReLU}\left( \frac{\partial y^c}{\partial A_{ij}^k} \right), \qquad y^c = \sum_k w_k^c \sum_i \sum_j A_{ij}^k, \qquad \alpha_{ij}^{kc} = \frac{\dfrac{\partial^2 y^c}{(\partial A_{ij}^k)^2}}{2 \dfrac{\partial^2 y^c}{(\partial A_{ij}^k)^2} + \sum_a \sum_b A_{ab}^k \dfrac{\partial^3 y^c}{(\partial A_{ij}^k)^3}}.   (4)

Figure 9 shows an example of using Grad-CAM++, which starts from taking a radiograph as input and passing it through convolutional layers before obtaining rectified convolutional feature maps with guided backpropagation and Grad-CAM++, to be fed into a fully-connected softmax layer for the classification. Finally, critical image areas are localized with heatmaps (HMs). Although CAM variants back-propagate gradients all the way up to the inputs, the gradients are essentially propagated only until the final convolutional layer. Besides, they are architecture-specific.

Figure 9 Example of Grad-CAM++ showing pixel relevances using heatmaps to localize critical regions in knee radiography [51]

In contrast, LRP is based on the assumption that the likelihood of a class can be traced backward through a network to the individual layer-wise nodes of the input [42], making it suitable for imaging. For an image recognition task, LRP results in an HM that indicates which pixels of the image are important for a model's prediction. It identifies important pixels by running a backward pass in a CNN. The backward pass is a conservative relevance redistribution procedure, where nodes that contribute the most to the higher layer receive the most relevance: an image x is classified in a forward pass and the relevance R_t^{(L)} is then back-propagated to generate a relevance map R_{LRP}. The relevance R_n^{(l)} at node n in layer l is recursively calculated as [42]:

R_n^{(l)} = \sum_m \frac{a_n^{(l)} w_{n,m}^{+(l)}}{\sum_{n'} a_{n'}^{(l)} w_{n',m}^{+(l)}} R_m^{(l+1)},   (5)

where it is assumed for an (L + 1)-layered network that each of the first L layers consists of 1, 2, ..., N nodes and 1, 2, ..., M are the nodes in layer l + 1. The node-level relevance in the case of negative values is calculated using ReLU as [42]:

R_n^{(l)} = \sum_m \frac{x_n^{(l)} w_{n,m}^{(l)} - b_n^{(l)} w_{n,m}^{+(l)} - h_n^{(l)} w_{n,m}^{-(l)}}{\sum_{n'} \left( x_{n'}^{(l)} w_{n',m}^{(l)} - b_{n'}^{(l)} w_{n',m}^{+(l)} - h_{n'}^{(l)} w_{n',m}^{-(l)} \right)} R_m^{(l+1)}.   (6)
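To illustrate the relevance-redistribution rule of Eq. (5), here is a small NumPy sketch of the z+ rule for a two-layer fully connected ReLU network. The weights and the input are random placeholders; in practice the input layer would use the bounded rule of Eq. (6), and the output-layer relevance is initialised as in Eq. (7) below.

```python
import numpy as np

rng = np.random.default_rng(0)

# a tiny two-layer ReLU network with random (placeholder) weights
W1, b1 = rng.normal(size=(20, 10)), np.zeros(10)   # input(20) -> hidden(10)
W2, b2 = rng.normal(size=(10, 3)), np.zeros(3)     # hidden(10) -> output(3)

x = rng.normal(size=20)
a1 = np.maximum(0, x @ W1 + b1)       # hidden activations
z2 = a1 @ W2 + b2                     # output scores

def lrp_zplus(a, W, R_upper):
    """Redistribute relevance from the upper layer to the lower layer
    using the z+ rule of Eq. (5): only positive weights contribute."""
    Wp = np.maximum(0, W)                      # w^{+}
    z = a @ Wp + 1e-9                          # denominator per upper node
    s = R_upper / z                            # relevance share per upper node
    return a * (Wp @ s)                        # relevance of each lower node

# all relevance starts at the predicted class t (cf. Eq. (7))
t = int(np.argmax(z2))
R_out = np.zeros_like(z2); R_out[t] = z2[t]

R_hidden = lrp_zplus(a1, W2, R_out)                     # layer 2 -> layer 1
# clipping the input keeps the sketch short; the z^B rule of Eq. (6) is the proper choice here
R_input = lrp_zplus(np.maximum(0, x), W1, R_hidden)     # layer 1 -> input
print(R_input)   # per-feature relevance scores; large values = important inputs
```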

The output layer's relevance is calculated as [42]:

R_n^{(L)} = \begin{cases} z_t^{(L)} & n = t, \\ 0 & \text{otherwise.} \end{cases}   (7)

Class-discriminating regions highlighted by Grad-CAM++ (e.g., on CT/MRI/X-ray images) are found to be less precisely localized and more scattered, while LRP highlights them more accurately [51]. The reason is that Grad-CAM++ replaces the GAG with the weighted mean instead of individual parts, hence it can highlight conjoined features more precisely. Therefore, if the diagnosis has to be provided based on microscopic histopathology images, an AI-guided image analysis tool could confirm the presence of breast cancer. Since class-discriminative areas can be localized w.r.t pixel relevance, Grad-CAM++ and LRP can be applied to ensure diagnosis reliability. Figure 10 shows an example of an explainable diagnosis of osteoarthritis from MRI and X-ray images. As seen, Grad-CAM++ and LRP generate reliable HMs to show critical image regions with precise localization while depicting similar regions as important.

Attention-based probing techniques Attention mechanisms are proposed to detect larger subsets of features. Therefore, they improve accuracy in a variety of language modelling tasks. The concept of propositional self-attention networks (PSAN) is an earlier method based on attention [99]. In PSAN, attention heads are used, where the number of heads represents distinct relations between input features. Besides, the attention mechanism is widely used in NLP, computer vision, and speech recognition tasks. In particular, transformer language models (TLMs) such as bidirectional encoder representations from transformers (BERT) [20] apply it to represent distinct relations between input features and identify important tokens for the next word prediction6. Xu et al. [104] have shown that TLMs have great potential for biomedical text mining tasks such as named entity recognition (NER) from unstructured texts with high accuracy [5].

TLMs are effective for domain-specific fine-tuning in a transfer learning setting. Subsequently, they have become the de-facto standard for representation learning for text mining and information extraction in NLP. Besides, the attention mechanism is also increasingly applied beyond text. The self-attention network (SAN) [92] is proposed to identify important features from datasets having a large number of features, indicating that having not enough data can distil the relevant parts of the feature space [92]. TabNet [8] is another approach, which uses a sequential attention mechanism to choose a subset of semantically meaningful features to process at each decision step. It visualizes feature importance and how features are combined to quantify individual feature contributions, leveraging local and global interpretability. In these approaches, attention layers (ALs) are placed as hidden layers that map real values to parts of the human-understandable input space [92]. An AL can be represented as [92]:

l_2 = \sigma\left( W_2 \cdot \alpha\left( W_{|F|} \cdot \Omega(X) + b_{l_1} \right) + b_{l_2} \right),   (8)

where α is an activation function, b_{l_i} represents the layer-wise bias, and Ω represents the first network layer that maintains the connection with the input features X [92]:

\Omega(X) = X \otimes \bigoplus_{k=1}^{M} \left[ \mathrm{softmax}\left( W_{l_{att}}^{k} X + b_{l_{att}}^{k} \right) \right],   (9)

where X is first used as input to a softmax-activated layer by setting: i) its number of neurons equal to |F|, and ii) the number of attention heads, representing relations between input features, to k; W_{l_{att}}^{k} represents the set of weights of the individual attention heads. The softmax function applied to the i-th element of a weight vector v is defined as [92]:

\mathrm{softmax}(v_i) = \frac{\exp(v_i)}{\sum_{i=1}^{|F|} \exp(v_i)},   (10)

where W_{l_{att}}^{k} ∈ R^{|F|×|F|}, v ∈ R^{|F|}, and l_{att} represents a single AL. In an AL, an element-wise product with X is computed in the forward pass to predict labels ŷ, in which two consecutive dense layers l_1 and l_2 contribute to the predictions; ⊗ and ⊕ are the Hadamard product and the summation across k heads. As Ω maintains a bijection between the set of features and the weights of the attention heads, the weights of the |F| × |F| matrix represent the relations between features. Once the training is finished, the weights of the AL are activated using softmax as [99]:

R_l = \bigoplus_{k=1}^{M} \left[ \mathrm{softmax}\left( \mathrm{diag}\left( W_{l_{att}}^{k} \right) \right) \right],   (11)

where W_{l_{att}}^{k} ∈ R^{|Z|×|Z|}. Then, the top-k features can be extracted as the diagonal of W_{l_{att}}^{k}, ranked w.r.t. their respective weights, and conceptualised as feature importance.

6 BERT utilizes a bidirectional attention mechanism and large unsupervised corpora to obtain effective context-sensitive representations [105].

Figure 10 Example of using Grad-CAM++ and LRP for osteoarthritis diagnosis from knee MRI and X-ray images, highlighting critical knee regions, complemented by textual explanations (based on [51])

Figure 11 Example of explainable NER, where the model classifies and highlights relevant biomedical entities

Figure 12 Example of using SAN for cancer type prediction from gene expression data (based on [52])

Figure 13 Example of using SHAP for ranking globally important features w.r.t feature importance [49]

Perturbing black-box model

Sensitivity analysis, game-theoretic approaches, and feature-based attribution methods fall into this category.

Feature-based attribution methods Knowing which features are statistically most important to a model helps achieve and generate human-level interpretability. Some features have a higher impact than others. This notion of feature importance (ref. definition 5) can be computed as permutation feature importance (PFI). PFI works by randomly permuting or shuffling a single column in the validation dataset while leaving all the other columns intact, where a feature is considered important if and only if the model's accuracy drops significantly, thereby increasing the prediction error. A feature is considered unimportant if shuffling its values does not significantly affect the accuracy or performance of the model.

FI can be conceptualized both locally and globally: the effects of a feature for a single prediction, over a large number of samples, or for the overall predictions. Let x_i be a feature in instance x. FI methods define an explanation function g : f × R^d ↦ R^d that returns importance scores g(f, x) ∈ R^d for all features of f and a point of interest x. Besides, FI can be extracted from tree-based models, e.g., DTs or tree ensembles.
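As a quick illustration (using scikit-learn's implementation rather than any specific pipeline from the paper), permutation importance can be computed on a held-out validation split as follows; the random-forest model and the synthetic data are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=30, n_informative=5, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# shuffle one column at a time and record how much the validation score drops
result = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
for i in ranking[:5]:
    print(f"feature {i}: mean importance {result.importances_mean[i]:.4f}")
```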
Game theoretic approach For a non-linear function (unlike a linear model), the order in which features are observed by the model matters. Scott M. Lundberg et al. [70] showed that tree-based methods like DTs or tree ensembles are similar to using a single ordering defined by a tree's decision path, yielding inconsistency in PFI [70]. A widely used perturbation-based method is SHapley Additive exPlanations (SHAP). SHAP is a game-theory-based approach built on Shapley values (SVs). An SV is the average marginal contribution of a feature and a way to distribute the gains to its players; therefore, SVs give an idea of how to fairly distribute the payout [77]. SHAP connects game theory with local explanations by uniting several previous methods and represents the only possible consistent and locally accurate additive feature attribution method based on expectations [70].

Let f be an interpretable model, x_i be a feature in instance x in a dataset D, and let ŷ be the prediction for the instance x. Then, SHAP explains the prediction ŷ by computing the contribution of each feature x_i in x w.r.t SVs. SVs are based on coalitional game theory [15]7 and are used as the measure of feature attributions, where each feature x_i acts as a player in a coalition. SHAP models an explanation as [77]:

f(z') = \phi_0 + \sum_{i=1}^{C} \phi_i z_i',   (12)

where z' ∈ {0, 1}^C is the coalition vector8 such that the effect of observing or not observing x_i is calculated by setting z_i' = 1 or z_i' = 0, C is the maximum coalition size equal to the number of input features M, and φ_i ∈ R is the feature attribution (SV) for x_i. Taking the fairness aspect into account, the computation is repeated for all possible orders. The importance φ of a feature x_i is computed by comparing what model f predicts with and without x_i for all possible combinations of the M − 1 features (i.e., excluding feature x_i) in x as [69]:

\phi_{x_i} = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \,(M - |S| - 1)!}{M!} \left[ f_x(S \cup \{i\}) - f_x(S) \right].   (13)

SHAP values explain the output of a function as a sum of the effects φ_i of each feature being introduced into a conditional expectation. SHAP averages over all possible orderings for computing the mean SVs. If x_i has no or almost zero effect on the predicted value, it is expected to produce an SV of 0. If two features x_i and x_{i+1} contribute equally to the prediction, their SVs should be the same [70]. To compute global importance, absolute SVs per feature are averaged across instances [69].

7 In game theory, a cooperative game is played with competition between groups of players known as "coalitions", as there is a possibility of external enforcement of cooperative behaviour.
8 The coalition vector is a simplified feature representation for tabular data.
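In practice, SVs are rarely computed by brute force over Eq. (13); libraries such as shap approximate them efficiently. A minimal sketch with the shap package and a placeholder tree-ensemble model could look as follows (synthetic data, not the paper's experimental setup).

```python
import shap
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes (approximate) Shapley values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # local attributions, one row per sample
                                         # (multi-class models return one array per class)

# global importance: mean absolute Shapley value per feature across all instances
global_importance = np.abs(shap_values).mean(axis=0)
top = np.argsort(global_importance)[::-1][:5]
print("top-5 features by mean |SHAP|:", top)

# shap.summary_plot(shap_values, X)      # optional: bee-swarm summary plot
```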

Figure 14 Example of using SAN for ranking globally important features w.r.t feature impacts

Sensitivity analysis For a black-box or interpretable model f, sensitivity analysis (SA) is used to explain a prediction ŷ based on the model's locally evaluated gradient or partial derivatives. The sensitivity R_{x_i} = ∂f(x)/∂x_i quantifies the importance of an input feature x_i at a low level (e.g., an image pixel). This measure assumes that the most relevant features are those to which the output is most sensitive. Often HMs are plotted to visualize which pixels need to be changed to make the image look similar to the predicted class. However, such an HM does not indicate which pixels are pivotal for a specific prediction, making it unsuitable for quantitative evaluation and for validating globally important features. Therefore, SA is more suitable for tabular data, to inspect which features a model is more sensitive to.

To perform SA for a tabular dataset, the value of a feature x_i is changed while keeping the other features in the dataset unchanged. If any change in its value significantly impacts the prediction, the feature is considered to have a high impact and to be statistically significant. For this, a new test set X̂* is often created by applying a w-perturbation over feature x_i, followed by measuring its sensitivity at the global level. To measure the change in the predictions, the mean square error (MSE) between actual and predicted labels is compared, i.e., the sensitivity R_{x_i} for feature x_i is the difference between the MSE on the original feature space X and on the sampled X̂*. However, SA requires a large number of calculations (i.e., N × M, where N and M are the number of instances and features) across predictions9.

9 Therefore, some approaches [47] recommend making minimal changes to the top-k features only, thereby reducing the computational complexity.
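A minimal NumPy/scikit-learn sketch of this perturbation-style sensitivity check (written purely for illustration, with placeholder data and model) is shown below: each feature is perturbed in turn, and the resulting change in prediction error is used as its sensitivity score.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=400, n_features=15, n_informative=4, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

baseline_mse = mean_squared_error(y, model.predict(X))
sensitivity = np.zeros(X.shape[1])

for i in range(X.shape[1]):
    X_pert = X.copy()
    # small w-perturbation of feature i only; all other features stay unchanged
    X_pert[:, i] += 0.5 * X[:, i].std()
    sensitivity[i] = mean_squared_error(y, model.predict(X_pert)) - baseline_mse

print("most sensitive features:", np.argsort(sensitivity)[::-1][:5])
```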
Tree, textual, and rules-based approaches

For the cancer diagnosis example introduced earlier, let a DT classifier be used for cancer-type prediction. As shown in fig. 15, suppose that, for a test instance, the model predicts from the patient's GE profile that the selected patient has breast cancer with a probability of 75.3%, showing an average response (i.e., intercept) of +0.452. Let the patient's gender, age, and some marker genes be the most contributing features w.r.t feature impact score. However, explaining this decision by combining all these concepts may not be well understood by all users, e.g., patients. The reason is that explaining a decision with plots and charts is good for exploration and discovery, but interpreting them may be difficult for patients.

Figure 15 Explaining a diagnosis decision to a patient is difficult unless it is interpreted in a simpler way: a) explaining with a plot, b) explaining with a rule, c) explaining in natural human language [52]

Using a decision rule (DR), it would be easier to explain this decision intuitively to humans, with the ability to look up the reasons. A DR that maps an observation to a prediction can have several characteristics: i) general structure - IF conditions are met THEN make a certain prediction; ii) number of conditions - at least one feature=value statement in the condition, with no upper limit, and more statements can be added with an 'AND' operator; iii) single or multiple DRs - although a combination of several rules can be used to make predictions, sometimes a single DR is enough to explain the outcome. Since an explanation relates the feature values of a sample to its prediction, rule-based explanations are easier for humans to understand [75]. Further, IF-THEN structures semantically resemble natural language and the way humans think [77]. The same decision can now be translated into a rule: "IF gender = female AND age > 50 AND top gene = (EPHA6, NACA2) AND gType = Oncogene, THEN type = BRCA".
However, explaining this to a patient may still be difficult where each sample x∗ reaches exactly one leaf node, Rj is
unless they are explained in a human-interpretable way. A the subset of the data representing the combination of rules at
simple explanation for this, in a natural language, could be each internal node, and I{.} is an identity function [22]. The
“Increased breast cancer is associated with risk for developing mean importance for xi is computed by going through all the
lymphedema, musculoskeletal symptoms, and osteoporosis. Be- splits for which xi was used and adding up how much it has
9
improved the prediction in a child node (than its parent node)
Therefore, some approaches [47] recommends making minimal changes to top-
k features only, thereby reducing the computational complexity. w.r.t decreased measure of Gini impurity IG [34]:
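To make the tree- and rule-based explanation concrete, the following minimal sketch (Python, scikit-learn) trains a shallow decision tree on synthetic data, prints the global rule set, and reconstructs the single root-to-leaf IF-THEN rule behind one prediction. The feature names, class names, and data are hypothetical placeholders for a real GE profile; they are not from the study described above.

```python
# Minimal sketch: deriving IF-THEN decision rules from a decision tree.
# X, y, and the feature/class names are synthetic, illustrative placeholders.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)
feature_names = ["EPHA6", "NACA2", "TP53", "age"]      # illustrative features
class_names = ["BLCA", "BRCA"]                         # illustrative cancer types
X = rng.normal(size=(200, len(feature_names)))         # stand-in for a GE profile + age
y = (X[:, 0] + X[:, 1] > 0).astype(int)                # synthetic labels

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Global view: every root-to-leaf path as an IF-THEN rule.
print(export_text(tree, feature_names=feature_names))

# Impurity-based (Gini) feature importance, cf. eq. (15) below.
print(dict(zip(feature_names, tree.feature_importances_.round(3))))

# Local view: the single rule (root-to-leaf path) used for one test instance.
x_test = X[:1]
for node in tree.decision_path(x_test).indices:
    if tree.tree_.children_left[node] == -1:           # leaf node reached
        print("THEN type =", class_names[tree.predict(x_test)[0]])
    else:
        idx = tree.tree_.feature[node]
        thr = tree.tree_.threshold[node]
        op = "<=" if x_test[0, idx] <= thr else ">"
        print(f"IF {feature_names[idx]} {op} {thr:.3f}")
```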

The mean importance of x_i is computed by going through all the splits for which x_i was used and adding up how much it improved the prediction in a child node (compared to its parent node), measured as the decrease in Gini impurity IG [34]:

IG_Q = \sum_{k=1}^{N} p_k (1 - p_k),   (15)

where p_k is the proportion of instances having label y_k in a node Q of the DT. The sum of all importance values is scaled to 1: a mean importance of 0 means x_i is not important, while an importance value of 1 means x_i is highly important.
As for tree ensembles (i.e., gradient boosted trees (GBT) or random forest (RF)), the prediction function f(x) is defined as the sum of individual feature contributions plus the average contribution of the initial node in a DT, with K possible class labels that change along the prediction paths after every split w.r.t. IG [3]:

f(x) = c_{full} + \sum_{k=1}^{M} \sigma(x, k),   (16)

where c_{full} is the average over the training set X and M is the number of features. Thus, DRs can naturally be derived from a root-to-leaf path in a DT: starting at the root and satisfying the split condition of each decision node, the path is traversed until a leaf node is reached, as shown in fig. 16.

Figure 15 Explaining a diagnosis decision to a patient is difficult unless it is interpreted in a simpler way: a) explaining with the plot, b) explaining with a rule, c) explaining in natural human language [52]

Figure 16 An example showing how a tree-based model has arrived at a decision (bladder urothelial carcinoma (BLCA))

Model surrogation strategies
Since interpretability comes at the cost of an accuracy vs. complexity trade-off, research has suggested learning a simple interpretable model to imitate a complex model [77]. A surrogate or simple proxy model is often developed to learn a locally faithful approximation of the black-box [94]. Model surrogation is a model interpretation strategy that involves training an inherently interpretable model to approximate the predictions of the black-box model [77]. Depending on the complexity of the problem, the surrogate model is trained either on the same data the black-box was trained on or on sampled data, as shown in fig. 17.
Since the most important features can be identified by the black-box, training an interpretable model on the top-k feature space (ref. definition 7) is reasonable. Let X* be sampled data for the original dataset X (e.g., a simplified version of the data containing the top-k features only) and Y be the ground truths. A surrogate model f can then be trained on X* and Y. The advantage of model surrogation is that any interpretable model can be used [77], e.g., LR, DT, GBT, or RF classifiers (although tree ensembles are complex and often treated as black-boxes, decision rules can be extracted and FI can be computed from them).
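The following minimal sketch illustrates this surrogation idea under synthetic assumptions: a random forest plays the role of the black-box, and a shallow decision tree is trained to imitate its predictions; the reported fidelity is simply the fraction of instances on which the surrogate reproduces the black-box decision. In practice X would be, e.g., a top-k GE matrix.

```python
# Minimal sketch of (global) model surrogation: an interpretable decision tree
# is trained to imitate the predictions of a black-box random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                       # stand-in for top-k features (X*)
y = (X[:, :3].sum(axis=1) > 0).astype(int)           # synthetic ground truth (Y)

black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
y_bb = black_box.predict(X)                          # black-box predictions to imitate

surrogate = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y_bb)

# Fidelity: how often the surrogate reproduces the black-box decision.
fidelity = (surrogate.predict(X) == y_bb).mean()
print(f"surrogate fidelity w.r.t. black-box: {fidelity:.3f}")
```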

Figure 17 Creation of an interpretable surrogate model based on the multimodal black-box model, and use of the surrogate model to explain the decision (grey circles with red numbers represent different steps of the process) [52]

Causal inference and contrastive explanations
ML or DNN models usually generate statistical outputs that are based on correlational rather than causal inference (i.e., they focus on association instead of causality). Let a set of relevant input features x be mapped to a target variable y based on an established association or correlation between them. Humans tend to think in a counterfactual way by asking questions such as "How would the prediction have been if input x had been different?". While we would conventionally say that features in x are correlated with y, it is not justified to say that features in x cause y, i.e., "correlation does not imply causation". However, recent interpretable ML methods try to answer questions related to causality, such as "Was it a specific feature that caused the diagnosis decision made by the model?". Statistical interpretability aims to uncover statistical associations, while causal interpretability is designed to answer what-if and why questions, yielding the highest level of interpretability [79].
Kim and Bastani [57] propose to learn an oracle model fc that estimates the causal effects covering all observed instances and then to learn an interpretable model f that approximates the oracle model fc. We would argue that a black-box can be trained to learn causal effects, followed by generating an interpretable model by employing model surrogation strategies. Lipton [68] argues that a causally interpretable model (ref. definition 16) is often indispensable to guarantee fairness. Therefore, by using a set of DRs and counterfactuals, it is possible to explain the decisions directly to humans with the ability to comprehend the reason, so that users can focus on understanding the learned knowledge instead of the underlying data representations [75]. Local rule-based explanations (LORE) [30] is an approach that learns an interpretable model on a neighbourhood computed using a genetic algorithm. It derives explanations from the interpretable model in order to explain the decisions in the form of DRs and counterfactuals (ref. definition 14). The partial dependence plot (PDP) is another way to depict the marginal effect of features on the predicted outcomes. PDP allows measuring the change in predictions after making an intervention (w-perturbations), which can help to discover the features' causal relationships.
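The intervention underlying a PDP can be computed directly, as in the following minimal sketch: the feature of interest is fixed to each value on a grid while all other features keep their observed values, and the predictions are averaged. The data, model, and feature index are synthetic placeholders, not results from the case studies discussed in this paper.

```python
# Minimal sketch of a partial dependence computation for one feature.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 10))
y = (1.5 * X[:, 0] + X[:, 1] > 0).astype(int)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

feature_idx = 0                                   # e.g., expression of a candidate marker gene
grid = np.linspace(X[:, feature_idx].min(), X[:, feature_idx].max(), 20)

pdp = []
for value in grid:
    X_mod = X.copy()
    X_mod[:, feature_idx] = value                 # intervention: fix the feature to `value`
    pdp.append(model.predict_proba(X_mod)[:, 1].mean())

for v, p in zip(grid, pdp):
    print(f"x={v:+.2f}  mean P(y=1)={p:.3f}")
```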

Human-in-the-loop and knowledge-based methods
Since an explanation is an "interface" between humans and a decision maker [31], interactive ML systems can add the component of human expertise to AI processes by enabling them to re-enact and retrace AI/ML results [40]. Therefore, the human-computer interaction (HCI) community has a long-standing interest in algorithmic transparency towards making AI systems more human-friendly [67]. As shown in fig. 18, an interactive human-AI interface can be developed to evaluate the quality of explanations for an AI system [40]. A more concrete example could be developing an explainable chatbot to enable interaction between humans and an AI system. Via a chatbot, users can ask questions and get answers from the AI system interactively, e.g., explanations about cancer decisions made by ML models.
In statistical learning theory, ML models are usually trained on data and iteratively optimized. However, apart from the data, no domain knowledge is incorporated. In many scientific discovery settings, where the goal is rather a better understanding, employing sophisticated ML methods and getting high accuracy is not enough. Even though an ML model may be better than humans at making accurate predictions and processing patterns, it may not be able to reason about or explain the underlying decisions. Therefore, the model would not be able to act as humans do, e.g., infer symbols, abstractions, and functional connections [46]. If trained on data accompanied by expert knowledge or metadata, a well-fitted black-box model would have sufficient know-how of what statistically significant features it has learned from a domain point of view.
By interpreting a complex black-box model, we would know which features in a dataset are most important. The model could tell us, for our cancer diagnosis example, which oncogenes or protein-coding genes are most important. However, the model is not capable of validating whether they are really of biological significance (e.g., are all features in the antecedents biologically significant?). Since the generated explanations are based on statistical learning theory, we cannot confidently say that the decisions are reliable. Unless the recommendation made by an AI system is validated against domain knowledge, such unreliable decisions can disproportionately impact the diagnosis and could lead a medical doctor to choose the wrong treatment. Further, in cancerogenesis, it is also important to understand the biological mechanism. An interpretable model alone cannot build the foundation of a reliable and trustworthy AI system. Besides, rigorous clinical validation w.r.t. domain expertise is required before deploying an AI system in a clinical setting.
The integration of an ML model with a knowledge-based system would allow human operators to perform reasoning and question answering over domain knowledge (this paradigm, which is emerging as neuro-symbolic AI, combines the connectionist and symbolic AI paradigms). Further, domain experts need to rely on up-to-date knowledge and findings from the scientific literature. Knowledge and facts about drugs, genes, proteins, and their mechanisms are spread across a huge number of structured (knowledge bases) and unstructured (e.g., scientific articles) sources [48, 52]. Scientific articles are a large-scale knowledge source and can have a huge impact on disseminating knowledge about the mechanisms of certain biological processes, e.g., diseases, in order to design prevention strategies and support the therapeutic decision-making process [48]. From such structured and unstructured sources, facts containing simple statements like "TP53 is an oncogene" or quantified statements like "Oncogenes are responsible for cancer" can be extracted and integrated into a knowledge graph (KG) in three steps: i) NER, which recognizes the named entities in scientific articles, ii) entity linking, to link the extracted named entities, and iii) relation extraction.
NER means recognizing domain-specific proper nouns in a biomedical corpus. It can be performed by fine-tuning a domain-specific BERT variant such as BioBERT [63] or SciBERT [12] on relevant articles. For example, for the abstract "Cyclooxygenase (COX)-2 mRNA and protein expression were found to be frequently elevated in human pancreatic adenocarcinomas and cell lines derived from such tumours. Immunohistochemistry demonstrated cytoplasmic COX-2 expression in 14 of 21 (67%) pancreatic carcinomas. The level of COX-2 mRNA was found to be elevated in carcinomas, relative to the histologically normal pancreas from a healthy individual, as assessed by reverse transcription-PCR.", an interpretable BioBERT [64]-based NER model [4] would be able to recognize named entities classified as disease, chemical, or genetic entities, which can be highlighted with different HMs.
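A minimal sketch of this kind of biomedical NER with the Hugging Face transformers pipeline is shown below. The checkpoint name is a placeholder, not a real model identifier: any BioBERT/SciBERT token-classification model fine-tuned for disease, chemical, and gene entities could be substituted.

```python
# Minimal sketch of biomedical NER with a fine-tuned BioBERT-style checkpoint.
# "path/to/biobert-finetuned-ner" is a placeholder; substitute a real checkpoint.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="path/to/biobert-finetuned-ner",   # placeholder checkpoint
    aggregation_strategy="simple",           # merge word pieces into entity spans
)

abstract = (
    "Cyclooxygenase (COX)-2 mRNA and protein expression were found to be "
    "frequently elevated in human pancreatic adenocarcinomas and cell lines "
    "derived from such tumours."
)

for entity in ner(abstract):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```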
These facts can be extracted as triples of the form (u, e, v) = (subject, predicate, object), where each triple forms a connected component of a sentence for the KG. For example, for the sample text "TP53 is responsible for a disease called BRCA (breast cancer). TP53 has POTSF (proto-oncogene with tumour-suppressor function) functionality, which is mentioned in numerous PubMed articles", TP53, disease, BRCA, POTSF, and PubMed are the extracted named entities. Then, by linking the entities with concepts in a domain-specific ontology, the following relation triples can be generated and integrated into a KG: (TP53, causes, BRCA), (TP53, hasType, POTSF), (BRCA, a, Disease), (POTSF, hasEvidence, PubMed).
The full potential of an AI system can only be exploited by integrating both domain and human knowledge [83, 98, 112]. Inference rules (IRs) are a straightforward way to provide automated access to deductive knowledge [38]. An IR encodes IF-THEN-style consequences, p → y, where both body and head follow graph patterns in the KG. Reasoning over KGs using a reasoner helps deduce implicit knowledge from existing facts [27]. For example, assuming the two facts "TP53 is an oncogene" and "oncogenes are responsible for cancer" are already present, reasoning over the KG would entail the extended knowledge "TP53 is responsible for cancer". Then, as shown in fig. 18, a doctor, with their own expertise and by combining the facts from the KG, can explain the decision and provide further interpretation (e.g., biomarkers and their relevance w.r.t. specific cancer types) [52].

Figure 18 Decision reasoning with rules in textual and natural human language based on facts from a domain-knowledge graph [52]

Further, Xian et al. [102] have shown that a neuro-symbolic system could be an effective means to generate causal reasoning if the KG contains sufficient knowledge.
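The following minimal sketch applies the inference rule from the example above to a toy KG of plain (subject, predicate, object) tuples; entity and relation names are illustrative, and a production system would use an RDF store and a proper reasoner instead.

```python
# Minimal sketch of an IF-THEN inference rule applied to a toy knowledge graph.
kg = {
    ("TP53", "hasType", "Oncogene"),
    ("Oncogene", "responsibleFor", "Cancer"),
    ("TP53", "causes", "BRCA"),
    ("BRCA", "a", "Disease"),
}

def apply_rule(graph):
    """IF (?g hasType Oncogene) AND (Oncogene responsibleFor Cancer)
       THEN (?g responsibleFor Cancer)."""
    inferred = set()
    if ("Oncogene", "responsibleFor", "Cancer") in graph:
        for s, p, o in graph:
            if p == "hasType" and o == "Oncogene":
                inferred.add((s, "responsibleFor", "Cancer"))
    return inferred

new_facts = apply_rule(kg) - kg
print(new_facts)   # {('TP53', 'responsibleFor', 'Cancer')}
```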

Measure of interpretability and explainability
AI explainability should have rigorous metrics to satisfy different needs in biomedical applications [33]. However, when it comes to providing qualitative or quantitative measures of explainability, there is little consensus on what interpretability in ML is and how to evaluate it for benchmarking. Current interpretability evaluation typically falls into two categories. The first category evaluates explainability in a quantifiable way: a domain user/expert first claims that some model family, e.g., linear models, rule lists, RF, or GBT, is interpretable and then presents algorithms to optimize within that family [74]. The second category evaluates the explainability of an ML model in the context of its applications from a qualitative point of view.
These approaches rely on the notion of "you will know it when you see it", where interpretability meets the first test of having face validity with a suitable set of test subjects: human operators. However, this is too naïve and leaves many questions unanswerable, e.g., "are all models in the defined-to-be-interpretable families actually interpretable?" or "are all model classes equally interpretable?". Answers to these questions can only be realized in terms of metrics that can be used to evaluate the quality of the explanations produced by an AI system. Kusner et al. [60] proposed a metric for measuring how fair decisions are w.r.t. counterfactuals: a decision ŷ is fair for an instance x if the prediction is the same in both the actual world and a counterfactual world in which the instance belonged to a different demographic group.
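A crude proxy for this counterfactual test is sketched below under synthetic assumptions: a sensitive attribute is flipped and the fraction of changed decisions is reported. This is only an approximation of the counterfactual fairness definition in [60], which requires a causal model of how the attribute affects the remaining features.

```python
# Minimal sketch: flip a sensitive attribute (column 0, e.g., a demographic
# indicator) and test whether the model's decisions change.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 5))
X[:, 0] = rng.integers(0, 2, size=300)           # sensitive attribute in {0, 1}
y = (X[:, 1] + X[:, 2] > 0).astype(int)          # labels independent of the attribute
model = LogisticRegression().fit(X, y)

X_cf = X.copy()
X_cf[:, 0] = 1 - X_cf[:, 0]                      # "counterfactual world": attribute flipped

changed = (model.predict(X) != model.predict(X_cf)).mean()
print(f"fraction of decisions that change under the flip: {changed:.3f}")
```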
Since surrogate models are often used to explain decisions, the quality of the explanations depends on the surrogate's predictive power [47]. To measure how well a surrogate model f has replicated a black-box fb, the R-squared measure (R²) is calculated as an indicator of goodness-of-fit. It is conceptualized as the percentage of variance of fb captured by the surrogate [77]:

R^2 = 1 - \frac{SSE^*}{SSE} = 1 - \frac{\sum_{i=1}^{N} (\hat{y}^*_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (\hat{y}_i - \bar{\hat{y}})^2},   (17)

where ŷ*_i is the prediction of the surrogate f and ŷ_i is the prediction of the black-box fb for X*; SSE* is the sum of squared errors of the surrogate w.r.t. the black-box predictions, and SSE is the total sum of squares of the black-box predictions around their mean [77]. Whether the surrogate model f can be used instead of the black-box fb can be judged based on the R² value:
• If R² is close to 1 (low SSE*), the surrogate model approximates the behaviour of the black-box model very well. Hence, f can be used instead of fb.
• If R² is close to 0 (high SSE*), the surrogate fails to approximate the black-box and hence cannot replace fb.
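A worked numerical example of eq. (17) is sketched below; the black-box and surrogate outputs are hypothetical numbers chosen for illustration only.

```python
# Minimal sketch of the R^2 fidelity measure in eq. (17).
import numpy as np

y_bb = np.array([0.91, 0.15, 0.78, 0.40, 0.66, 0.08])   # black-box outputs (e.g., P(BRCA))
y_sur = np.array([0.88, 0.20, 0.75, 0.35, 0.70, 0.10])  # surrogate outputs for the same X*

sse_star = np.sum((y_sur - y_bb) ** 2)                  # surrogate error w.r.t. black-box
sse = np.sum((y_bb - y_bb.mean()) ** 2)                 # variance of black-box outputs
r2 = 1 - sse_star / sse
print(f"R^2 = {r2:.3f}")   # close to 1: the surrogate can stand in for the black-box
```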
Comprehensiveness and sufficiency have been proposed to measure the quality of explanations in NLP. These metrics are based on the concept of rationales introduced by Zaidan et al. [106]: human annotators highlight the parts of a text that support their labelling decision (for example, to justify why a movie review is positive, an annotator can highlight the most important words and phrases that would tell someone why to see the movie, while to justify why a review is negative, the annotator can highlight the words and phrases that would tell someone why not to see it). This concept has been found useful across NLP tasks like sentiment analysis [36]. Rationale-based explanation is key to understanding an AI system, since it translates the rationales into usable and easily understandable formats. Let f(x_i)_c be the original prediction probability of model f for class c. To measure sufficiency and comprehensiveness, a contrastive example x̃_i is created for each sample x_i by removing the predicted rationales r_i. Sufficiency measures the degree to which the rationales alone are adequate for the model f to make the prediction [21]:

s = f(x_i)_c - f(r_i)_c,   (18)

Let f(x_i \setminus r_i)_c be the predicted probability of x̃_i (= x_i \ r_i), i.e., the input with the rationales removed. The prediction is expected to be lower once the rationales are removed, and comprehensiveness e is calculated as [21]:

e = f(x_i)_c - f(x_i \setminus r_i)_c.   (19)

The same can be conceptualized for our cancer example w.r.t. leave-one-feature-out: the rationale can be computed based on the number of extracted features (e.g., top-k genes) divided by the number of features in x. A prediction ŷ is considered a match if its overlap with any of the ground-truth rationales r_i is ≥ 0.25. A high value of comprehensiveness implies that the rationales were influential w.r.t. the prediction ŷ.
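A minimal sketch of eqs. (18) and (19) for a toy sentiment classifier is given below; the corpus, rationale tokens, and labels are synthetic and serve only to show how the two scores are computed.

```python
# Minimal sketch of sufficiency (eq. 18) and comprehensiveness (eq. 19).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great acting and a moving story", "dull plot and terrible acting",
         "moving performances, great film", "terrible pacing, dull and boring"]
labels = [1, 0, 1, 0]
clf = make_pipeline(CountVectorizer(), LogisticRegression()).fit(texts, labels)

x = "great acting but a dull ending"
rationale = {"great", "acting"}                    # tokens an annotator might highlight
x_rationale_only = " ".join(t for t in x.split() if t in rationale)
x_without_rationale = " ".join(t for t in x.split() if t not in rationale)

c = 1                                              # positive class
p_full = clf.predict_proba([x])[0, c]
sufficiency = p_full - clf.predict_proba([x_rationale_only])[0, c]            # eq. (18)
comprehensiveness = p_full - clf.predict_proba([x_without_rationale])[0, c]   # eq. (19)
print(f"sufficiency={sufficiency:.3f}  comprehensiveness={comprehensiveness:.3f}")
```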
If an AI system is useful, then it must somehow be interpretable and able to provide human-understandable explanations [13]. Hoffman et al. [37] proposed the following metrics to evaluate an AI system:
• The goodness of the explanations
• Whether users are satisfied by the explanations
• How well users understand the AI system
• How curiosity motivates the search for explanations
• Whether the user's trust in the AI is appropriate
• How the human-XAI work system performs.
The system causability scale (SCS) [40] is another measure, based on the notion of causability [39]. It is proposed to quickly determine whether and to what extent an explainable user interface, an explanation, or an explanation process itself is suitable for the intended purpose [40]. SCS combines concepts adapted from a widely accepted usability scale and is computed based on responses to the ten usability questions listed in table 2. SCS uses a 5-point scale: 1 = strongly disagree; 2 = disagree; 3 = neutral; 4 = agree; 5 = strongly agree. SCS is measured by dividing the sum of the acquired ratings by the total possible rating, i.e., SCS = \sum_{i=1}^{10} Rating_i / 50.

Interpretable ML frameworks and libraries
Following the timeline of XAI methods (ref. fig. 8), numerous frameworks and libraries have been developed that are based on interpretable ML methods such as LIME [84], model understanding through subspace explanations (MUSE) [61], SHAP [70] (and its variants kernel SHAP and tree SHAP), PDP, individual conditional expectation (ICE), permutation feature importance (PFI), and counterfactual explanations (CE) [100]. We provide a list of some widely used interpretable tools and libraries, with links to their GitHub repositories:
• DeepLIFT: https://github.com/kundajelab/deeplift
• Xplique: https://github.com/deel-ai/xplique/
• DALEX: https://dalex.drwhy.ai/python/
• Alibi: https://github.com/SeldonIO/alibi
• SHAP: https://github.com/slundberg/shap
• LIME: https://github.com/marcotcr/lime
• ELI5: https://github.com/TeamHG-Memex/eli5
• InterpretML: https://github.com/interpretml/interpret
• Interpret-Text: https://github.com/interpretml/interpret-text
• ExplainerDash: https://github.com/oegedijk/explainerdashboard
• CNN viz: https://github.com/utkuozbulak/pytorch-cnn-visualizations
• iNNvestigate: https://github.com/albermax/innvestigate
• DeepExplain: https://github.com/marcoancona/DeepExplain
• Lucid: https://github.com/tensorflow/lucid
• TorchRay: https://facebookresearch.github.io/TorchRay/
• Captum: https://captum.ai/
• AIX360: https://github.com/Trusted-AI/AIX360
• BERTViz: https://github.com/jessevig/bertviz
• Bio-NER: https://librairy.github.io/bio-ner/.
These tools and libraries have been developed with a view to improving the interpretability and explainability of black-box ML models, covering general-purpose problems. In particular, the majority of these tools target computer vision, text mining, or structured data, and hence are not specialized to tackle bioinformatics/biomedical problems by default. Further, not every form of explanation (e.g., image-, tree-, rule-, evidence-, or text-based) can be generated using a single tool. Therefore, these tools may need to be customized, or even multiple tools may need to be combined, in order to develop XAI applications for bioinformatics that handle diverse data types.
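As a usage illustration for one of the listed libraries, the following minimal sketch computes tree-SHAP attributions for a single prediction of a random forest. It assumes the shap package is installed; the data, model, and feature dimensionality are synthetic placeholders.

```python
# Minimal sketch: local SHAP attributions for one prediction of a tree ensemble.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 5))
y = (X[:, 0] - X[:, 3] > 0).astype(int)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)          # tree SHAP for tree ensembles
shap_values = explainer.shap_values(X[:1])     # per-feature contributions for one instance
print(shap_values)
```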

Table 2 SCS and its usability questions with interpretation (adopted from [40])
Usability question: Interpretation
Factors in data: Data included all relevant known causal factors with sufficient precision and granularity
Understood: Explanations are within the context of my work
Change detail level: Level of detail can be changed on demand
Understanding causality: Explanations helped me understand the causality
Use with knowledge: Explanations could be used with my knowledge base
No inconsistencies: No inconsistencies between explanations
Learn to understand: No support was necessary to understand the explanations
Needs references: Additional references were not necessary in the explanations
Efficient: Explanations were generated in a timely and efficient manner.
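The SCS computation described in the previous section reduces to a simple normalized sum, sketched below with hypothetical ratings for the ten items of Table 2.

```python
# Minimal sketch of the SCS score: ten ratings on a 1-5 Likert scale
# (hypothetical values), normalized by the maximum total score of 50.
ratings = [4, 5, 3, 4, 4, 5, 4, 3, 4, 5]        # one rating per Table 2 item
assert len(ratings) == 10 and all(1 <= r <= 5 for r in ratings)
scs = sum(ratings) / 50
print(f"SCS = {scs:.2f}")                        # closer to 1 indicates better causability
```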

Conclusions and outlook
In this paper, we discussed the importance of interpretability in bioinformatics and provided a comprehensive overview of interpretable ML methods. Interpretability is key to generating insights into why and how a certain prediction has been made by a model (e.g., the most important biomarkers exhibit shared characteristics while clustering patients having a certain cancer type, which can help in recommending more accurate treatments and in drug repositioning). Through several examples covering bioimaging, genomics, and biomedical text mining, we showed how bioinformatics research could benefit from interpretability and different means of explanation (e.g., rules, plots, heatmaps, textual, and knowledge-based). Besides, we analyzed existing interpretable ML tools and libraries that can be employed to improve interpretability while solving complex bioinformatics research problems and use cases.
Although interpretability could contribute to trustworthy AI [57], the benefits of explainability still need to be proven in practical settings [53]. We argue that interpretability alone cannot make an AI system fully reliable and trustworthy. There are other important considerations and potential challenges in the development and deployment phases – especially for mission-critical application scenarios (e.g., clinical). First, before building the interpretable model, decision-makers and domain experts should discuss: i) what kind of data is to be used - imaging, text, tabular, or graphs?; ii) what types of explanations need to be provided - e.g., visual-, tree-, rule-based, or textual?; iii) how could the full potential of global interpretability be achieved in order to tailor and generalize the model better for unseen data?; and iv) how could local interpretability be used if the model fails (e.g., wrong diagnosis), so the model can be diagnosed before re-training? Besides, it is important to outline the decision-making process and the potential impacts the model could have on the targeted users [53].
Second, one of the first steps to improving an AI system is to understand its weaknesses. The better we understand which factors cause a model to make which prediction, or which factors caused it to make wrong predictions, the easier it becomes to improve it [74]. However, such weakness analysis on black-box or even interpretable models is not straightforward [13]; it requires close monitoring and debugging by zooming into individual data points. Third, as outlined in the co-badged guidance on explaining decisions made with AI [53], explanations of AI-assisted decisions could potentially disclose commercially sensitive information as well as how an ML model works. Besides, there are potential risks with the rationale, fairness, and data explanation types – information on how other similar individuals were treated and details on the input data for a particular decision. Therefore, it is important to limit detail such as feature weightings or importance, as well as to carefully assess this risk as part of data protection.
Fourth, AI systems may be vulnerable to imperceptible attacks (e.g., adversarial attacks), biased against underrepresented groups (e.g., lack of fairness in decision-making), and lacking in individual privacy protection, which not only degrades the user experience but also erodes society's trust. Therefore, an AI system should be robust to adversaries, which can only be achieved by taking reactive and proactive measures against adversarial attacks. Since an adversarially robust model can generate consistent and reliable predictions, the robustness property of an AI system may need to be formulated such that the predictions remain stable under small variations in the inputs (e.g., adding minor noise to the inputs should not flip the prediction to a completely different one). Fifth, an AI system should not only be able to provide meaningful information (e.g., clinical outcomes) and supporting explanations in a human-interpretable form outlining why and how ML models reach a decision, but also actionable insights (e.g., treatment recommendations). Sixth, when using interpretable models, the biological relevance of top-ranked features (e.g., cancer-specific biomarkers) needs to be validated both clinically and based on domain knowledge (e.g., oncologists can combine their expertise with evidence from a domain knowledge graph). In this regard, the EU guideline on robustness and explainability of AI [32] emphasizes three aspects for the right deployment of AI: transparency, reliability, and individual data protection in AI models. We believe the techniques and methods discussed in this paper provide valuable insights into interpretable ML and XAI in general. We hope domain experts (e.g., doctors, data scientists), lay persons (e.g., patients), and other stakeholders will benefit too, helping to accelerate bioinformatics research.

Key points:
• Post-hoc or ante-hoc interpretable methods? Since linear models are easy to interpret and comprehend, the first choice should be building and explaining using inherently interpretable models, in an ante-hoc manner. However, depending on the modelling complexity, a simple or surrogate model may not be able to fully capture the behaviour of a black-box model. Consequently, it could lead to wrong decisions – especially if the model surrogation process is not thoroughly evaluated [47]. Therefore, it is recommended to build a black-box model, followed by embedding interpretable ML logic into it and using the interpretable model to explain the decisions. The latter approach combines both ante-hoc and post-hoc approaches in a single pipeline.
• Local or global explainability? Global explainability is important for monitoring and having a holistic view of

a model's interpretability (e.g., which features are most important to the model?). Local explainability is important for individual instances, without providing a general understanding of the model, e.g., which factors contributed most to a certain prediction? Further, in the case of wrong predictions, local explainability helps in diagnosing why the model has made wrong predictions for certain instances.
• What ML models/methods and explanation types? The choice depends on the problem, requirements, data types, as well as the target users (decision recipients). For example, decision rules and counterfactuals are more effective at providing intuitive explanations compared to visualization-based explanations. The former are suitable for tabular data, while the latter are for imaging [77]. Further, since one size may not fit all, several explanation types out of visual-, tree-, rule-, or text-based may need to be used. An AI system needs to be built such that relevant information can be extracted for a range of explanation types [47].
• Accuracy or explainability? Since achieving mediocre performance is not acceptable in many critical situations, accuracy and robustness are the two most important factors. However, depending on the complexity level of the problem at hand and the data types and volume, balancing the performance vs. explainability trade-off could be difficult. As long as explainability is not critical, solving a complex problem and building an efficient model is more important.
• Can explainability help mitigate unfairness in the decision-making process? Interpretability and explainability are two important properties that can help identify sensitive features that could drive a model to generate unfair decisions. This helps take proactive measures to ensure a decision does not reflect discriminatory behaviour toward certain groups or populations because of such attributes.
• Is explainability always necessary? Higher interpretability means easier comprehension and easier explanation of the predictions to target users [94]. However, not all predictions may need to be explained. Solving AI learning security issues or fixing learning flaws could be more important than building an interpretable model in application domains like disease diagnosis in translational bioinformatics [33].
• Human-in-the-loop and domain knowledge? HCI researchers should closely collaborate with domain experts (e.g., data scientists and doctors) – especially during the development phase – and utilize only state-of-the-art algorithms and interpretable ML methods. Such collaborations would help AI developers design question-driven user interfaces that allow human operators from diverse domains to ask "why", "how", and "what-if" types of questions and that generate causal and contrastive explanations.

Conflict of Interest
No conflict of interest to declare for this article.

Acknowledgment
This article is loosely based on the Ph.D. dissertation titled "Interpreting black-box machine learning models with decision rules and knowledge graph reasoning" [52] by the first author.

Abbreviations and Acronyms
• Attention layers (ALs)
• Artificial intelligence (AI)
• Autoencoders (AEs)
• Bayesian rule lists (BRL)
• Bayesian case model (BCM)
• Convolutional neural network (CNN)
• Concept activation vectors (CAVs)
• Counterfactual explanations (CE)
• Contrastive layer-wise relevance propagation (CLRP)
• Convolutional autoencoder (CAE)
• Class activation mapping (CAM)
• Causal concept effect (CaCE)
• Causal inference (CI)
• Deep learning (DL)
• Deep neural networks (DNNs)
• Decision rules (DRs)
• Decision tree (DT)
• Explainable artificial intelligence (XAI)
• Feature maps (FMs)
• General data protection regulation (GDPR)
• Generative discriminative models (GDM)
• Global attribution mapping (GAM)
• Gene expression (GE)
• Guided back-propagation (GB)
• Gradient-weighted class activation mapping (Grad-CAM++)
• Global average pooling (GAP)
• Globally averaged gradients (GAG)
• Gradient boosted trees (GBT)
• Graph neural networks (GNNs)
• Heat maps (HMs)
• Isometric mapping (Isomap)
• Individual conditional expectation (ICE)
• K-nearest neighbour (KNN)
• Layer-wise relevance propagation (LRP)
• Local interpretable model-agnostic explanations (LIME)
• Local rule-based explanations (LORE)
• Long short-term memory (LSTM)
• Machine learning (ML)
• Model-agnostic counterfactual explanation (MACE)
• Model understanding through subspace explanations (MUSE)
• Natural language processing (NLP)
• Neural additive models (NAMs)
• Principal component analysis (PCA)
• Partial dependence plots (PDP)
• Peak response mapping (PRM)
• Propositional self-attention networks (PSAN)
• Prediction difference analysis (PDA)
• Permutation feature importance (PFI)
• Random forest (RF)
• Rectified linear unit (ReLU)
• Saliency maps (SM)
• Self-attention network (SAN)
• System causability scale (SCS)
• Sequential covering (SC)
• SHapley Additive exPlanations (SHAP)
• Sensitivity analysis (SA)
• Salient relevance (SR)
• Spectral relevance analysis (SpRAy)
• Shapley values (SVs)
• Singular value decomposition (SVD).

Author details
Md. Rezaul Karim is a Senior Data Scientist at Fraunhofer FIT and a Postdoctoral Researcher at RWTH Aachen University, Germany. Previously, he worked as a Machine Learning Researcher at the Insight Centre for Data Analytics, University of Galway, Ireland. Before that, he worked as a Lead Software Engineer at Samsung Electronics, South Korea. He received his Ph.D. from RWTH Aachen University, Germany, his M.Eng. degree from Kyung Hee University, South Korea, and his B.Sc. degree from the University of Dhaka, Bangladesh. His research interests include machine learning, knowledge graphs, and explainable AI (XAI) with a focus on bioinformatics.

Tanhim Islam is a Data Scientist at CONET Solutions GmbH, Germany. He received his B.Sc. degree in Computer Science from BRAC University, Bangladesh, and his M.Sc. from RWTH Aachen University, Germany. His research interests include applied machine learning, NLP, and explainable AI.

Oya Beyan is a Professor of medical informatics at the University of Cologne, Faculty of Medicine, and University Hospital Cologne. Prof. Beyan is the director of the Institute for Biomedical Informatics, which aims to support data-driven medicine and digital transformation in healthcare. Prof. Beyan's research team focuses on health data reusability, semantic interoperability, and data science towards continuous improvement of healthcare through innovation and creating new knowledge. Prof. Beyan co-leads the MEDIC data integration center at the Medical Faculty of Cologne, where multi-modal health data is integrated and reused for research. She is also affiliated with Fraunhofer FIT, where she leads a research and innovation group for FAIR data and distributed analytics.

Christoph Lange is the Head of the Data Science and Artificial Intelligence department at Fraunhofer FIT, Germany, and a Senior Researcher at RWTH Aachen University. He received his Ph.D. in computer science from Jacobs University Bremen, Germany. His research is broadly concerned with knowledge engineering and data infrastructures: choosing the right formalism to represent domain knowledge in an expressive and scalable way following Linked Data and FAIR principles, thus enabling and facilitating machine support with sharing, publishing, and collaborative authoring.

Michael Cochez is an Assistant Professor at Vrije Universiteit Amsterdam, the Netherlands. He was formerly a Senior Researcher at Fraunhofer FIT, Germany. He received his Ph.D. and M.Sc. degrees in Mathematical Information Technology from the University of Jyväskylä, Finland, and his B.Sc. degree in Information Technology from the University of Antwerp, Belgium. His research interests include knowledge representation, deep learning, machine learning, and question answering over knowledge graphs.

Dietrich Rebholz-Schuhmann is a Professor of Medicine at the University of Cologne and the Scientific Director of the Information Center for Life Sciences, German National Library of Medicine (ZB MED), Germany. He was formerly the Director of the Insight Centre for Data Analytics and Professor of Informatics at the University of Galway, Ireland. Before that, he was the Director of Healthcare IT at a Heidelberg, Germany-based bioscience company and Group Leader of Semantic Data Analytics at the European Bioinformatics Institute (EMBL-EBI). His research interests include data science, semantics-driven data analytics, biomedical text mining, and bioinformatics.

Stefan Decker is the Chair and Professor for Information Systems and Databases at RWTH Aachen University and Managing Director of Fraunhofer FIT, Germany. He was formerly the Director of the Insight Centre for Data Analytics and Professor of Informatics at the University of Galway, Ireland. Even before that, he worked as an Assistant Professor at the University of Southern California and as a Postdoctoral Researcher at Stanford University, USA. His research interests include the Semantic Web, linked data, and knowledge representation.

References
[1] Agarwal R, Melnick L, Frosst N, et al. (2021). Neural additive models: Interpretable machine learning with neural nets. Advances in Neural Information Processing Systems, 34, 4699–4711.
[2] Aggarwal C. C and Reddy C. K (2014). Data clustering: algorithms and applications. CRC Press.
[3] Al-Obeidat F, Rocha Á, Akram M, et al. (2021). (CDRGI)-Cancer detection through relevant genes identification. Neural Computing and Applications, pages 1–8.
[4] Alonso Casero Á (2021). Named entity recognition and normalization in biomedical literature: a practical case in SARS-CoV-2 literature. Ph.D. thesis, ETSI_Informatica.
[5] Anantharangachar R, Ramani S, and Rajagopalan S (2013). Ontology guided information extraction from unstructured text. arXiv preprint arXiv:1302.1335.
[6] Ancona M, Ceolini E, Öztireli C, and Gross M (2017). Towards better understanding of gradient-based attribution methods for deep neural networks. arXiv preprint arXiv:1711.06104.
[7] Arik S. O and Pfister T (2019). ProtoAttend: Attention-based prototypical learning. arXiv preprint arXiv:1902.06292.
[8] Arık S. O and Pfister T (2021). TabNet: Attentive interpretable tabular learning. In AAAI, volume 35, pages 6679–6687.
[9] Azodi C. B, Tang J, and Shiu S.-H (2020). Opening the black box: Interpretable machine learning for geneticists. Trends in Genetics.
[10] Bach S, Binder A, Montavon G, et al. (2015a). On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE, 10(7).
[11] Bach S, Binder A, Montavon G, et al. (2015b). On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE, 10(7), e0130140.
[12] Beltagy I, Lo K, and Cohan A (2019). SciBERT: Pretrained language model for scientific text. In EMNLP.
[13] Bhatt U, Xiang A, Sharma S, et al. (2020). Explainable machine learning in deployment. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 648–657.
[14] Böhle M, Eitel F, Weygandt M, and Ritter K (2019). Layer-wise relevance propagation for explaining deep neural network decisions in MRI-based Alzheimer's disease classification. Frontiers in Aging Neuroscience, 11, 194.
[15] Branzei R, Dimitrov D, and Tijs S (2008). Models in cooperative game theory, volume 556. Springer Science & Business Media.
[16] Caruana R, Lou Y, Gehrke J, et al. (2015). Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1721–1730.
[17] Chattopadhay A and Sarkar A (2018). Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. In Conf. on Applications of Computer Vision (WACV), pages 839–847. IEEE.
[18] Das A and Rad P (2020). Opportunities and challenges in explainable artificial intelligence (XAI): A survey. arXiv preprint arXiv:2006.11371.

[19] Davenport T and Kalakota R (2019). The potential for artificial intelligence in healthcare. Future Healthcare Journal, 6(2), 94.
[20] Devlin J, Chang M.-W, Lee K, et al. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
[21] DeYoung J, Jain S, Rajani N. F, et al. (2019). ERASER: A benchmark to evaluate rationalized NLP models. arXiv preprint arXiv:1911.03429.
[22] Di Castro F and Bertini E (2019). Surrogate decision tree visualization. In IUI Workshops.
[23] Diakopoulos N and Koliska M (2017). Algorithmic transparency in the news media. Digital Journalism, 5(7), 809–828.
[24] Doshi-Velez F and Kim B (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.
[25] Erhan D, Courville A, and Bengio Y (2010). Understanding representations learned in deep architectures. Technical Report.
[26] Fournier Q and Aloise D (2019). Empirical comparison between autoencoders and traditional dimensionality reduction methods. In 2019 IEEE Second International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), pages 211–214. IEEE.
[27] Futia G and Vetrò A (2020). On the integration of knowledge graphs into deep learning models for a more comprehensible AI—three challenges for future research. Information, 11(2), 122.
[28] Ghorbani A, Wexler J, Zou J. Y, and Kim B (2019). Towards automatic concept-based explanations. Advances in Neural Information Processing Systems, 32.
[29] Goyal Y, Feder A, Shalit U, and Kim B (2019). Explaining classifiers with causal concept effect (CaCE). arXiv preprint arXiv:1907.07165.
[30] Guidotti R, Monreale A, Ruggieri S, et al. (2018a). Local rule-based explanations of black box decision systems. arXiv preprint arXiv:1805.10820.
[31] Guidotti R, Monreale A, Ruggieri S, et al. (2018b). A survey of methods for explaining black box models. ACM Computing Surveys (CSUR), 51(5), 1–42.
[32] Hamon R, Junklewitz H, and Sanchez I (2020). Robustness and explainability of artificial intelligence. Publications Office of the European Union.
[33] Han H and Liu X (2022). The challenges of explainable AI in biomedical data science.
[34] Hatwell J, Gaber M. M, and Azad R. M. A (2020). CHIRPS: Explaining random forest classification. Artificial Intelligence Review, 53, 5747–5788.
[35] He K, Zhang X, Ren S, and Sun J (2016). Computer vision and pattern recognition. CVPR.
[36] Herrewijnen E, Nguyen D, Mense J, and Bex F (2021). Machine-annotated rationales: Faithfully explaining text classification.
[37] Hoffman R. R, Mueller S. T, Klein G, and Litman J (2018). Metrics for explainable AI: Challenges and prospects. arXiv preprint arXiv:1812.04608.
[38] Hogan A, Blomqvist E, Cochez M, et al. (2020). Knowledge graphs. arXiv preprint arXiv:2003.02320.
[39] Holzinger A, Langs G, Denk H, et al. (2019). Causability and explainability of artificial intelligence in medicine. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(4), e1312.
[40] Holzinger A, Carrington A, and Müller H (2020). Measuring the quality of explanations: the system causability scale (SCS). KI - Künstliche Intelligenz, pages 1–6.
[41] Ibrahim M, Louie M, Modarres C, and Paisley J (2019). Global explanations of neural networks: Mapping the landscape of predictions. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pages 279–287.
[42] Iwana B. K, Kuroki R, and Uchida S (2019). Explaining convolutional neural networks using softmax gradient layer-wise relevance propagation. arXiv preprint arXiv:1908.04351.
[43] Izadyyazdanabadi M, Belykh E, Cavallo C, et al. (2018). Weakly-supervised learning-based feature localization for confocal laser endomicroscopy glioma images. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 300–308. Springer.
[44] Jacovi A, Shalom O. S, and Goldberg Y (2018). Understanding convolutional neural networks for text classification. arXiv preprint arXiv:1809.08037.
[45] Kaminski M. E (2019). The right to explanation, explained. Berkeley Tech. LJ, 34, 189.
[46] Kapanipathi P, Abdelaziz I, Ravishankar S, et al. (2020). Question answering over knowledge bases by leveraging semantic parsing and neuro-symbolic reasoning. arXiv preprint arXiv:2012.01707.
[47] Karim M, Shajalal M, Graß A, et al. (2022a). Interpreting black-box machine learning models for high dimensional datasets. arXiv preprint arXiv:2208.13405.
[48] Karim M, Ali H, Das P, et al. (2022b). Question answering over biological knowledge graph via Amazon Alexa. arXiv preprint arXiv:2210.06040.
[49] Karim M. R, Cochez M, Beyan O, et al. (2019). OncoNetExplainer: explainable predictions of cancer types based on gene expression data. In 2019 IEEE 19th International Conference on Bioinformatics and Bioengineering (BIBE), pages 415–422. IEEE.
[50] Karim M. R, Beyan O, Zappa A, et al. (2020). Deep learning-based clustering approaches for bioinformatics. Briefings in Bioinformatics.
[51] Karim M. R, Jiao J, Döhmen T, et al. (2021). DeepKneeExplainer: explainable knee osteoarthritis diagnosis from radiographs and magnetic resonance imaging. IEEE Access, 9, 39757–39780.
[52] Karim M. R, Rebholz-Schuhmann D, and Decker S (2022c). Interpreting black-box machine learning models for high dimensional datasets.
[53] Kazim E and Koshiyama A (2020). Explaining decisions made with AI: a review of the co-badged guidance by the ICO and the Turing Institute. Available at SSRN 3656269.
[54] Kim B, Rudin C, and Shah J. A (2014). The Bayesian case model: A generative approach for case-based reasoning and prototype classification. Advances in Neural Information Processing Systems, 27.
[55] Kim B, Wattenberg M, Gilmer J, et al. (2018). Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV). In International Conference on Machine Learning, pages 2668–2677. PMLR.
[56] Kim B. J, Koo G, Choi H, and Kim S. W (2020). Extending class activation mapping using Gaussian receptive field. arXiv preprint arXiv:2001.05153.
[57] Kim C and Bastani O (2019). Learning interpretable models with causal guarantees. arXiv preprint arXiv:1901.08576.

[58] Kindermans P.-J, Schütt K. T, Alber M, et al. (2017). Learning how to explain neural networks: PatternNet and PatternAttribution. arXiv preprint arXiv:1705.05598.
[59] Kourou K, Exarchos T. P, Exarchos K. P, et al. (2015). Machine learning applications in cancer prognosis and prediction. Computational and Structural Biotechnology Journal, 13, 8–17.
[60] Kusner M. J, Loftus J, Russell C, and Silva R (2017). Counterfactual fairness. Advances in Neural Information Processing Systems, 30.
[61] Lakkaraju H, Kamar E, Caruana R, and Leskovec J (2019). Faithful and customizable explanations of black box models. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, pages 131–138.
[62] Lapuschkin S, Wäldchen S, Binder A, et al. (2019). Unmasking Clever Hans predictors and assessing what machines really learn. Nature Communications, 10(1), 1–8.
[63] Lee J, Yoon W, Kim S, et al. (2019). BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics.
[64] Lee J, Yoon W, Kim S, et al. (2020). BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4), 1234–1240.
[65] Letham B, Rudin C, McCormick T. H, and Madigan D (2015). Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model. The Annals of Applied Statistics, 9(3), 1350–1371.
[66] Li H, Tian Y, Mueller K, and Chen X (2019). Beyond saliency: understanding convolutional neural networks from saliency prediction on layer-wise relevance propagation. Image and Vision Computing, 83, 70–86.
[67] Liao Q. V, Pribić M, Han J, et al. (2021). Question-driven design process for explainable AI user experiences. arXiv preprint arXiv:2104.03483.
[68] Lipton Z. C (2018). The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, 16(3), 31–57.
[69] Lundberg S. M and Lee S.-I (2017a). Consistent feature attribution for tree ensembles. arXiv preprint arXiv:1706.06060.
[70] Lundberg S. M and Lee S.-I (2017b). A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pages 4765–4774.
[71] Mehrabi N, Morstatter F, Saxena N, et al. (2019a). A survey on bias and fairness in machine learning. arXiv preprint arXiv:1908.09635.
[72] Mehrabi N, Morstatter F, Saxena N, et al. (2019b). A survey on bias and fairness in machine learning. arXiv preprint arXiv:1908.09635.
[73] Meske C, Bunde E, Schneider J, and Gersch M (2022). Explainable artificial intelligence: objectives, stakeholders, and future research opportunities. Information Systems Management, 39(1), 53–63.
[74] Miller T (2018). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence.
[75] Ming Y, Qu H, and Bertini E (2018). RuleMatrix: Visualizing and understanding classifiers with rules. IEEE Transactions on Visualization and Computer Graphics, 25(1), 342–352.
[76] Mitchell M (2019). Fairness.
[77] Molnar C (2020). Interpretable machine learning. Lulu.com.
[78] Montavon G, Lapuschkin S, Binder A, et al. (2017). Explaining nonlinear classification decisions with deep Taylor decomposition. Pattern Recognition, 65, 211–222.
[79] Moraffah R, Karami M, Guo R, et al. (2020). Causal interpretability for machine learning - problems, methods and evaluation. ACM SIGKDD Explorations Newsletter, 22(1), 18–33.
[80] Nie W, Zhang Y, and Patel A (2018). A theoretical explanation for perplexing behaviors of backpropagation-based visualizations. arXiv preprint arXiv:1805.07039.
[81] Obermeyer Z, Powers B, Vogeli C, and Mullainathan S (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447–453.
[82] Petsiuk V, Das A, and Saenko K (2018). RISE: Randomized input sampling for explanation of black-box models. arXiv preprint arXiv:1806.07421.
[83] Rajabi E and Etminani K (2022). Knowledge-graph-based explainable AI: A systematic review. Journal of Information Science, page 01655515221112844.
[84] Ribeiro M, Singh S, and Guestrin C (2019). Local interpretable model-agnostic explanations (LIME): An introduction.
[85] Ribeiro M. T, Singh S, and Guestrin C (2016). Why should I trust you? Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144.
[86] Ribeiro M. T, Singh S, and Guestrin C (2018). Anchors: High-precision model-agnostic explanations. In Thirty-Second AAAI Conference on Artificial Intelligence.
[87] Schetinin V, Fieldsend J. E, Partridge D, et al. (2007). Confident interpretation of Bayesian decision tree ensembles for clinical applications. IEEE Transactions on Information Technology in Biomedicine, 11(3), 312–319.
[88] Selvaraju R. R, Cogswell M, Das A, et al. (2017a). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618–626.
[89] Selvaraju R. R, Cogswell M, Das A, et al. (2017b). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618–626.
[90] Shrikumar A, Greenside P, and Kundaje A (2017). Learning important features through propagating activation differences. In International Conference on Machine Learning, pages 3145–3153. PMLR.
[91] Simonyan K, Vedaldi A, and Zisserman A (2013). Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034.
[92] Škrlj B, Džeroski S, Lavrač N, and Petkovič M (2020). Feature importance estimation with self-attention networks. arXiv preprint arXiv:2002.04464.
[93] Springenberg J. T, Dosovitskiy A, Brox T, and Riedmiller M (2014). Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806.
[94] Stiglic G, Kocbek P, Fijacko N, et al. (2020). Interpretability of machine learning based prediction models in healthcare. arXiv preprint arXiv:2002.08596.
[95] Sundararajan M, Taly A, and Yan Q (2017). Axiomatic attribution for deep networks. In International Conference on Machine Learning, pages 3319–3328. PMLR.

[96] Tang Z, Chuang K. V, DeCarli C, et al. (2019). Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline. Nature Communications, 10(1), 1–14.
[97] Tjoa E and Guan C (2020). A survey on explainable artificial intelligence (XAI): Toward medical XAI. IEEE Transactions on Neural Networks and Learning Systems, 32(11), 4793–4813.
[98] Tocchetti A and Brambilla M (2022). The role of human knowledge in explainable AI. Data, 7(7), 93.
[99] Vaswani A, Shazeer N, Parmar N, et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 5998–6008.
[100] Wachter S, Mittelstadt B, and Russell C (2017). Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL & Tech., 31, 841.
[101] Wang Z, Liu Z, Wei W, and Duan H (2021). SalED: Saliency prediction with a pithy encoder-decoder architecture sensing local and global information. Image and Vision Computing, 109, 104149.
[102] Xian Y, Fu Z, Muthukrishnan S, et al. (2019). Reinforcement knowledge graph reasoning for explainable recommendation. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 285–294.
[103] Xu C and Doshi T (2019). Fairness indicators: Scalable infrastructure for fair ML systems.
[104] Xu J, Kim S, Song M, et al. (2020). Building a PubMed knowledge graph. Scientific Data, 7(1), 1–15.
[105] Xue K, Zhou Y, Ma Z, et al. (2019). Fine-tuning BERT for joint entity and relation extraction in Chinese medical text. In Int. Conf. Bioinformatics and Biomedicine (BIBM), pages 892–897. IEEE.
[106] Zaidan O, Eisner J, and Piatko C (2007). Using "annotator rationales" to improve machine learning for text categorization. In The Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technologies, pages 260–267.
[107] Zednik C (2019). Solving the black box problem: A normative framework for explainable artificial intelligence. Philosophy & Technology, pages 1–24.
[108] Zeiler M. D and Fergus R (2014). Visualizing and understanding convolutional networks. In European Conference on Computer Vision, pages 818–833. Springer.
[109] Zhao G, Zhou B, Wang K, et al. (2018). Respond-CAM: Analyzing deep models for 3D imaging data by visualizations. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 485–492. Springer.
[110] Zhou B, Khosla A, Lapedriza A, et al. (2016). Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2921–2929.
[111] Zhou Y, Zhu Y, Ye Q, et al. (2018). Weakly supervised instance segmentation using class peak response. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3791–3800.
[112] Zhu M, Weng Y, He S, et al. (2022). ReasonChainQA: Text-based complex question answering with explainable evidence chains. arXiv preprint arXiv:2210.08763.
[113] Zintgraf L. M, Cohen T. S, Adel T, and Welling M (2017). Visualizing deep neural network decisions: Prediction difference analysis. arXiv preprint arXiv:1702.04595.
