Trust in AI-Enabled Decision Support Systems: Preliminary Validation of MAST Criteria
I. Introduction
Artificial intelligence (AI) and machine learning have enabled
Artificial intelligence (AI) and machine learning have enabled advanced information processing capabilities, but their use in high-risk decision environments remains limited, in part due to a lack of trust in these systems. To address this issue, in 2019, the AI Team of the Public-Private Analytic Exchange Program, Office of the Director of National Intelligence and Department of Homeland Security, adapted Intelligence Community Directive (ICD) 203 to create the Multisource AI Scorecard Table (MAST). MAST is an evaluation checklist and methodology that qualifies nine criteria to inform trust in AI-enabled decision support systems (AI-DSS). These criteria include: sourcing, uncertainty, distinguishing, analysis of alternatives, customer relevance, logical argumentation, consistency, accuracy, and visualization. We hypothesized that applying these criteria in the design, operation, or documentation of an AI-DSS would mitigate known issues in establishing trust in AI, such as clarifying the potential for data poisoning, report omissions, and communicating uncertainty. MAST, however, had not yet been empirically validated. This study addresses this gap across two AI-DSS to assess the extent to which MAST criteria relate to trust perceptions.

II. Method

The MAST criteria were used to design two AI-DSS: Facewise, a face-matching identity verification system, and READIT, a text summarization system. Two levels of each system were designed: high-MAST, which had a set of rich features that generally ranked high on each of the MAST criteria; and low-MAST, which had a minimal set of features similar to black-box systems and generally ranked low on each of the MAST criteria. For instance, for Facewise's uncertainty criterion, the high-MAST version displayed a confidence level along with a decision recommendation, whereas the low-MAST version displayed only the decision recommendation. These system designs were generated by the study team (the listed authors) through iterative testing and redesign.

Forty participants were recruited from Prolific, ten for each level of each system (2 x 2). After random assignment to one of the four groups, participants reviewed the system features by watching a short video. Participants then evaluated the system using MAST and responded to questionnaire items on perceived trust and message credibility.

III. Results and Discussion

The high-MAST version of both systems resulted in higher MAST ratings than the low-MAST versions for 4/9 criteria in Facewise and 9/9 criteria in READIT. This supports our assumption that systems designed according to MAST criteria would be perceived by others as responsive to the criteria, especially for a text summarization system. We also found positive correlations between MAST and trust items (r(38) = 0.55, p = 0.000219, moderate strength), MAST and credibility (r(38) = 0.54, p = 0.000375, moderate strength), and trust and credibility (r(38) = 0.85, p = 5.20E-12, high strength).

IV. Conclusion

MAST criteria can be used to assess trustworthiness and credibility of AI-enabled decision support systems, especially for text-based summarization machine learning systems. Future work should validate these findings against trust-related behavioral measures and compare MAST with other AI evaluation tools and methodologies.
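The correlation statistics reported in Section III are Pearson's r with n = 40 participants, hence df = n - 2 = 38. A minimal sketch of this analysis, using `scipy.stats.pearsonr` on hypothetical rating vectors (the study's per-participant MAST, trust, and credibility scores are not reproduced here, so the numbers below will not match the reported values):

```python
# Sketch of the Pearson correlation analysis with df = n - 2 = 38.
# The rating vectors are hypothetical stand-ins for the study's data:
# a latent "system quality" factor drives all three measures, which
# yields positively correlated ratings, as in the study design.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 40  # participants

latent = rng.normal(size=n)
mast = latent + rng.normal(scale=0.8, size=n)         # MAST ratings
trust = latent + rng.normal(scale=0.8, size=n)        # trust items
credibility = latent + rng.normal(scale=0.8, size=n)  # credibility items

for name, x, y in [("MAST-trust", mast, trust),
                   ("MAST-credibility", mast, credibility),
                   ("trust-credibility", trust, credibility)]:
    r, p = pearsonr(x, y)  # two-sided test, df = n - 2 = 38
    print(f"{name}: r(38) = {r:.2f}, p = {p:.3g}")
```

With real data, each rating vector would hold one aggregate score per participant (e.g., a mean across MAST criteria or questionnaire items), and the same call yields the r(38) and p values quoted above.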
This material is based on work supported by the U.S. Department of Homeland Security under Grant Award Number 17STQAC00001-05-00. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Department of Homeland Security.
Authorized licensed use limited to: ASU Library. Downloaded on January 09,2023 at 18:42:19 UTC from IEEE Xplore. Restrictions apply.