
SAE G-34/WG-114 Tech Talk Regular Sessions

Yu Huafeng, Boeing

Sept. 23, 2021

Robustness of AI models and
the way to AI certification

{maurizio.mongelli, alberto.carlevaro, sara.narteni, vanessa.orani}@ieiit.cnr.it

National Research Council of Italy (CNR), Institute of Electronics, Information Engineering and Telecommunications (IEIIT)

giacomo.gentile@collins.com

Collins Aerospace (applied research & technologies)


1
Index of the presentation
• Damage Propagation Modeling for Aircraft Engine Run-to-Failure Simulation
– Example of eXplainable and Reliable AI

• Relation with EASA FUG objectives

• Conclusions

2
Example

3
Damage Propagation Modeling for Aircraft
Engine Run-to-Failure Simulation

Saxena, Abhinav, and Kai Goebel, "Turbofan Engine Degradation Simulation Data Set," NASA Ames Prognostics Data Repository, https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/#turbofan, NASA Ames Research Center, Moffett Field, CA.
4
Damage Propagation Modeling for Aircraft
Engine Run-to-Failure Simulation

Saxena, Abhinav, Goebel, Kai, Simon, Don, and Eklund, Neil (2008), "Damage propagation modeling for aircraft engine run-to-failure simulation," International Conference on Prognostics and Health Management, https://bit.ly/3h1SWBt.
5
Damage Propagation Modeling for Aircraft
Engine Run-to-Failure Simulation

6
Damage Propagation Modeling for Aircraft
Engine Run-to-Failure Simulation

7
Damage Propagation Modeling for Aircraft
Engine Run-to-Failure Simulation

8
Damage Propagation Modeling for Aircraft
Engine Run-to-Failure Simulation

https://it.mathworks.com/help/predmaint/ug/remaining-useful-life-estimation-using-convolutional-neural-network.html#mw_rtc_RULEstimationUsingCNNExample_98C35430.
9
Remaining Useful Life (RUL) metric

https://it.mathworks.com/help/predmaint/ug/remaining-useful-life-estimation-using-convolutional-neural-network.html#mw_rtc_RULEstimationUsingCNNExample_98C35430.
10
From signal monitoring to feature extraction

https://it.mathworks.com/help/predmaint/ug/remaining-useful-life-estimation-using-convolutional-neural-network.html#mw_rtc_RULEstimationUsingCNNExample_98C35430.
11
eXplainable AI

12
Our approach: eXplainable AI (XAI)
Differing from the convolutional neural network (MATLAB example*), we design and implement an XAI solution (on the www.rulex.ai platform).

*
https://it.mathworks.com/help/predmaint/ug/remaining-useful-life-estimation-using-convolutional-neural-network.html#mw_rtc_RULEstimationUsingCNNExample_98C35430.
13
Our approach: eXplainable AI (XAI)
Differing from the convolutional neural network (MATLAB example*), we design and implement an XAI solution (on the www.rulex.ai platform).
Then, later, we study the robustness of the approach with respect to safety (Python, MATLAB).

*
https://it.mathworks.com/help/predmaint/ug/remaining-useful-life-estimation-using-convolutional-neural-network.html#mw_rtc_RULEstimationUsingCNNExample_98C35430.
14
eXplainable AI - why
Capital One Pursues 'Explainable AI' to Guard Against Bias in Models
The effort aims to better understand how a machine-learning model comes to a logical conclusion. Capital One Financial Corp. is researching ways that machine-learning algorithms could explain the rationale behind their answers, which could have far-reaching impacts in guarding against potential ethical and regulatory breaches as the firm uses more artificial intelligence in banking.

The Next Big Disruptive Trend in Business. . . Explainable AI
With so many different approaches to machine learning – neural networks, complex algorithms, probabilistic graphical models – it's getting increasingly difficult for humans to figure out how machines are coming to their conclusions.

Sure, A.I. Is Powerful—But Can We Make It Accountable?


Imagine you apply for insurance with a firm that uses a machine-learning system, instead of a human with an actuarial table, to predict
insurance risk. After crunching your info—age, job, house location and value—the machine decides, nope, no policy for you. You ask
the same question: “Why?”
Nobody can answer, because nobody understands how these systems—neural networks modeled on the human brain—produce their
results.

• EU General Data Protection Regulation (GDPR)


• In Effect May 2018
• Penalties as high as 4% of annual revenue
Artificial Intelligence Is Setting Up the Internet for a Huge Clash With Europe

The GDPR restricts what the EU calls "automated individual decision-making." And for the world's biggest tech companies, that's a potential problem. "Automated individual decision-making" is what neural networks do.
“They’re talking about machine learning,” says Bryce Goodman, a philosophy and social science researcher at
Oxford University.

The regulations prohibit any automated decision that “significantly affects” EU citizens. This includes techniques
that evaluate a person’s “performance at work, economic situation, health, personal preferences, interests,
reliability, behavior, location, or movements.” At the same time, the legislation provides what Goodman calls a
“right to explanation.” In other words, the rules give EU citizens the option of reviewing how a particular service
made a particular algorithmic decision.
EASA XAI

16
Feature extraction
The window moves over time (as in the Mathworks
example). For every time position of the window, the
following features are computed: mean, variance,
skewness and kurtosis (of every signal over the
window).

17
Features and ML problem
The window moves over time (as in the Mathworks
example). For every time position of the window, the
following features are computed: mean, variance,
skewness and kurtosis (of every signal over the
window).
The classification problem consists of mapping each
features vector sample into the corresponding RUL
class. The RUL class is as follows: RUL > 150 'healthy',
RUL in [50, 150] 'critical', 'fault' otherwise.
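A minimal Python sketch of this setup; the window length, column handling and helper names are illustrative assumptions, not taken from the slides:

```python
# Sliding-window feature extraction and RUL labeling (illustrative sketch).
import numpy as np
import pandas as pd
from scipy.stats import skew, kurtosis

WINDOW = 30  # assumed window length, in time cycles

def extract_features(signals: pd.DataFrame) -> pd.DataFrame:
    """For every time position of the window, compute mean, variance,
    skewness and kurtosis of every signal over the window."""
    feats = {}
    for col in signals.columns:
        roll = signals[col].rolling(WINDOW)
        feats[f"{col}_mean"] = roll.mean()
        feats[f"{col}_var"] = roll.var()
        feats[f"{col}_skew"] = roll.apply(lambda w: skew(w), raw=True)
        feats[f"{col}_kurt"] = roll.apply(lambda w: kurtosis(w), raw=True)
    return pd.DataFrame(feats).dropna()

def rul_class(rul: float) -> str:
    """RUL > 150 'healthy', RUL in [50, 150] 'critical', 'fault' otherwise."""
    if rul > 150:
        return "healthy"
    if rul >= 50:
        return "critical"
    return "fault"
```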

18
Features and ML problem
The window moves over time (as in the Mathworks
example). For every time position of the window, the
following features are computed: mean, variance,
skewness and kurtosis (of every signal over the
window).
The classification problem consists of mapping each
features vector sample into the corresponding RUL
class. The RUL class is as follows: RUL > 150 'healthy',
RUL in [50, 150] 'critical', 'fault' otherwise.
(The formulation is independent of the ML algorithm applied, XAI or not.)
19
Database (db)

20
RUL in the db

21
mean of T24 over time
with respect to the classes

22
mean of T24 over time
with respect to the classes

Aim of the analysis:


XAI finds the ranges of the variables to discriminate among the classes
23
Rules (after training)

24
Rule viewer

25
Confusion matrix (test set)

26
Confusion matrix (test set)

27
Feature ranking (e.g., for healthy class)

28
Value ranking
(e.g., of the most meaningful feature)

29
EASA FUG4L1 objectives: XAI

30
EASA FUG4L1 objectives: XAI

31
EASA FUG4L1 objectives: XAI,
human interaction

32
EASA FUG4L1 objectives: XAI,
questionable, as the explanation is inherent to the classification problem

33
EASA FUG4L1 objectives: XAI,
confidence of the rules…

34
EASA FUG4L1 objectives: XAI,
… and rule distance

S. Narteni, M. Ferretti, V. Rampa, M. Mongelli, "Bag-of-Words Similarity in eXplainable AI," submitted to ACM K-CAP 2021, https://bit.ly/38DQigz.
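As a rough illustration of the idea (the exact definition is in the cited paper), a bag-of-words view treats each rule as the set of its conditions and measures a distance between the sets:

```python
# Hypothetical sketch: the cited paper's similarity measure may differ.
def rule_distance(rule_a: set, rule_b: set) -> float:
    """Jaccard distance between two rules seen as sets of conditions."""
    union = rule_a | rule_b
    if not union:
        return 0.0
    return 1.0 - len(rule_a & rule_b) / len(union)

# Example with conditions from the platooning slides (one shared condition):
print(rule_distance({"PER>0.43", "F0<=-7500"}, {"PER>0.43", "F0>-3500"}))
```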
35
Reliable AI

36
Reliable AI
The current state of Tesla Autopilot, August 2019

• Trustworthy AI
– Trustworthiness is a prerequisite for people and societies to develop, deploy and use AI systems. Without AI systems – and the human beings behind them – being demonstrably worthy of trust, unwanted consequences may ensue and their uptake might be hindered, preventing the realisation of the potentially vast social and economic benefits that they can bring.
– https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai

• ISO/IEC JTC 1/SC 42 group:
– provides guidance to JTC 1, IEC, and ISO committees developing Artificial Intelligence applications
– https://jtc1info.org/jtc1-press-committee-info-about-jtc-1-sc-42
Reliable AI: our definition
1) Safety envelope: the feature sub-space in
which the ML model performs with zero
(statistical) error; e.g., when ML predicts no
collision (e.g., in a smart mobility scenario),
there is no collision in reality.

Vehicle platooning / aircraft collision avoidance
38


Reliable AI: our definition
1) Safety envelope: the feature sub-space in
which the ML model performs with zero
(statistical) error; e.g., when ML predicts no
collision (e.g., in a smart mobility scenario),
there is no collision in reality.
2) Find the largest safety envelope (e.g.,
minimum distance, maximum speed).

Vehicle platooning / aircraft collision avoidance
39
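As a minimal illustration of this definition, the check below computes FNR and TNR of a candidate envelope on labeled data (y = 1 collision, y = 0 safe); names and structure are illustrative, not the authors' code:

```python
import numpy as np

def envelope_metrics(y, in_envelope):
    """y: 1 = collision, 0 = safe; in_envelope: boolean mask of points the
    candidate safety envelope declares safe. FNR must be 0 for a valid
    envelope; TNR measures how much of the safe set it covers."""
    y = np.asarray(y)
    in_envelope = np.asarray(in_envelope, dtype=bool)
    fn = np.sum(in_envelope & (y == 1))   # collisions wrongly declared safe
    tn = np.sum(in_envelope & (y == 0))   # safe points correctly covered
    fnr = fn / max(np.sum(y == 1), 1)
    tnr = tn / max(np.sum(y == 0), 1)
    return fnr, tnr
```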


EASA FUG4L1 objectives:
Trustworthiness Analysis

40
Techniques for reliable AI

41
1) Safety region from XAI
2) More complex safety regions

42
Safety region from XAI

43
Safety region from XAI

From Feature and Value Ranking:


1. Reliability from Outside
2. Reliability from Inside

From an LLM (Logic Learning Machine) trained with 0% maximum error:


LLM with 0% error: joining of rules + threshold optimization
1. Train the LLM with 0% error.
2. Select the first m rules with highest coverage.
3. Join them with a logical OR: new predictor.
4. Perturb the most stringent threshold for the first N most relevant features.
5. Optimization problem: reduce the selected feature intervals until the error (FNR) reaches 0 with the maximum coverage (TNR) possible.
Result: a new «safety region» with a more complex shape than a hyper-rectangle.
Application example: collision avoidance
in vehicle platooning
Reliability from Outside: N=2 most relevant intervals for collision (y=1): PER > 0.43, F0 ≤ -7.50 × 10³. Obtained «safety region» with FNR=0, TNR=0.34.

Reliability from Inside: N=2 most relevant intervals for the safe (y=0) class: PER ≤ 0.43, F0 > -3.50 × 10³. Obtained «safety region» with FNR=0, TNR=0.13.
LLM with 0% error
Joining of the 4 rules with highest coverage:

• No perturbation: FNR=0.05, TNR=0.55


• Perturbation of the N=2 most important features: v(0), PER → FNR=0.02, TNR=0.45
(suboptimal solution)
More complex safety regions

50
Support Vector Data Description (SVDD)

Classic SVDD: unsupervised ML (anomaly detection, novelty detection, 1-class SVM).
Negative SVDD: supervised ML (classification), with $\mathbf{x}_i$ the target class and $\mathbf{x}_l$ the negative class.

$$\min F(R^2, \mathbf{a}, \xi_i, \xi_l) = R^2 + C_1 \sum_{i=1}^{N_1} \xi_i + C_2 \sum_{l=1}^{N_2} \xi_l$$
$$\text{s.t.}\quad \|\mathbf{x}_i - \mathbf{a}\|^2 \le R^2 + \xi_i, \quad \xi_i \ge 0 \;\; \forall i$$
$$\phantom{\text{s.t.}\quad} \|\mathbf{x}_l - \mathbf{a}\|^2 \ge R^2 - \xi_l, \quad \xi_l \ge 0 \;\; \forall l$$
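Classic SVDD with an RBF kernel is closely related to the one-class SVM, so a minimal unsupervised sketch can use scikit-learn; the negative-SVDD variant adds the C2-penalized negative examples of the formulation above and would need a custom QP solver, which this snippet does not attempt:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_target = rng.normal(size=(200, 2))        # illustrative target-class data

# nu upper-bounds the fraction of target points left outside the description
svdd = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(X_target)
inside = svdd.predict(X_target) == 1        # +1: inside the sphere
print(f"fraction of target points inside: {inside.mean():.2f}")
```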
Goal: $FNR \cong 0$

[Figure: SVDD-based safety regions on the vehicle platooning application; the four panels report FNR = 0.95, FNR = 0.07, FNR = 0.97 and FNR = 0.05.]

Application: vehicle platooning, a dynamical system with collision as the negative class and non-collision as the target class.

Reported metrics: $\frac{TP}{TP+FP}$ and $\frac{TN}{TN+FN}$.
RUL example

54
RUL example
Similar safety envelopes are found for the healthy class through sensitivity analysis on the top-ranked features.

55
RUL example
Similar safety envelopes are found for the healthy class through sensitivity analysis on the top-ranked features.
The value ranking has shown a predominance of
the htBleed variable for the healthy class.

56
Driver for certification

57
Interaction between
data analyst and safety engineer

The explainable safety envelope is discussed with the


safety engineer:

58
Interaction between
data analyst and safety engineer

The explainable safety envelope is discussed with the


safety engineer:
• is the (probabilistic) safety guarantee acceptable?
– probability of collision 0%
– reliability of object detection in video analytics

59
Interaction between
data analyst and safety engineer

The explainable safety envelope is discussed with the


safety engineer:
• is the (probabilistic) safety guarantee acceptable?
– probability of collision 0%
– reliability of object detection in video analytics

• is the amplitude of the safety envelope large enough?


– maximum acceptable speed, minimum acceptable distance

60
Interaction between
data analyst and safety engineer

The safety engineer is not satisfied:


• Provable performance: via formal logic

61
Interaction between
data analyst and safety engineer

The safety engineer is not satisfied:


• Provable performance: via formal logic
– is it still probabilistic?

C.-H. Cheng, R. Yan, "Continuous Safety Verification of Neural Networks," https://arxiv.org/abs/2010.05689.

M. Mongelli, M. Muselli, E. Ferrari, A. Scorzoni, "Accelerating PRISM Validation of Vehicle Platooning through Machine Learning," ICSRS 2019, https://bit.ly/3o0bZ3m.
62
Interaction between
data analyst and safety engineer

The safety engineer is not satisfied:


• Provable performance: testing AI software

63
Interaction between
data analyst and safety engineer

The safety engineer is not satisfied:


• Provable performance: testing AI software
– large simulations… (open scenario,
https://www.asam.net/conferences-events/detail/webinar-asam-openlabel/ )

– … & testing most critical scenarios on the field


• Safety in Autonomous Driving: Can Tools Offer Guarantees?
https://bit.ly/2W57qsR

64
Interaction between
data analyst and safety engineer

The safety engineer is not satisfied:


• Provable performance: testing AI software

Guidance on the Assurance of Machine Learning in Autonomous Systems (AMLAS), https://arxiv.org/abs/2102.01564.
65
Interaction between
data analyst and safety engineer

The safety engineer is not satisfied:


• imputable to AI?
– new sensors should be provided in order to enhance the
sensing capability of the system and increase the quality of the
AI inference

66
Interaction between
data analyst and safety engineer

The safety engineer should consider:


• The sensor devices of the features profiling the safety
envelope should be «safe» (redundant)
• https://www.youtube.com/watch?v=JqFmLrwOgC0&t=54s (automotive
example from www.safecop.eu)

67
Interaction between
data analyst and safety engineer

The safety engineer should consider:


• The sensor devices of the features profiling the safety
envelope should be «safe» (redundant)
– a failsafe fallback should be triggered in case a fault applies to
any of those sensors

68
Conclusions

69
Conclusions
Certification of machine learning models is one of
the main goals of AI in the near future.
eXplainable AI may drive the certification process
through Reliable AI.
Challenges: scalability, formal logic, interaction with the domain expert, … .

70
References

71
References
M. Mongelli, E. Ferrari, M. Muselli, A. Fermi, "Performance validation of vehicle platooning via intelligible analytics," IET Cyber-Physical Systems: Theory & Applications, 19 Oct. 2018, DOI: 10.1049/iet-cps.2018.5055.

S. Narteni, M. Ferretti, V. Orani, I. Vaccari, E. Cambiaso, M. Mongelli, "From Explainable to


Reliable Artificial Intelligence," International IFIP Cross Domain (CD) Conference for
Machine Learning & Knowledge Extraction (MAKE), CD-MAKE 2021, in conjunction with
the 16th International Conference on Availability, Reliability and Security ARES 2021,
August 17 – 20, 2021. Video: https://bit.ly/38CPVme

A. Carlevaro, M. Mongelli, "Reliable AI through SVDD and rule extraction," International


IFIP Cross Domain (CD) Conference for Machine Learning & Knowledge Extraction
(MAKE), CD-MAKE 2021, in conjunction with the 16th International Conference on
Availability, Reliability and Security ARES 2021, August 17 – 20, 2021. Video:
https://bit.ly/38EWgOg.

M. Mongelli and V. Orani, 'Stability Certification of Dynamical Systems: Lyapunov Logic


Learning Machine,' International Conference on Applied Soft computing and
Communication Networks (ACN'20), October 14-17, 2020, Chennai, India, in Lecture
Notes in Networks and Systems (LNNS), Springer, Singapore.
72
73
Back up slides

74
Reliable AI: our approach
• Modelling prediction of collision in
vehicle platooning
Reliable AI: our approach
• Modelling prediction of collision in
vehicle platooning

Model inferred by data.

«Fix» configurations leading to a collision.

Example: «Collision will happen since initial distance <= 24 and initial speed > 41. If you change initial distance to 25, collision will not happen.»

Which parameters mostly influence a collision?
Reliable AI: our approach (www.safecop.eu, https://valu3s.eu/)

• Modelling prediction of collision in vehicle platooning

Model inferred by data.

«Fix» configurations leading to a collision.

Example: «Collision will happen since initial distance <= 24 and initial speed > 41. If you change initial distance to 25, collision will not happen.»

Which parameters mostly influence a collision?

M. Mongelli, M. Muselli, E. Ferrari, A. Scorzoni, "Accelerating PRISM Validation of Vehicle Platooning through Machine Learning," 2019 4th International Conference on System Reliability and Safety (ICSRS 2019), Rome, Italy, 20-22 Nov. 2019.

M. Mongelli, "Design of countermeasure to packet falsification in vehicle platooning by explainable artificial intelligence," Comp. Commun., in press.

M. Mongelli and V. Orani, "Stability Certification of Dynamical Systems: Lyapunov Logic Learning Machine," International Conference on Applied Soft Computing and Communication Networks (ACN'20), October 14-17, 2020, Chennai, India, in Lecture Notes in Networks and Systems (LNNS), Springer, Singapore.
Logic Learning Machine

79
A.1) Discretization
Cut-offs applied to continuous variables.
Optimal placement of the cut-offs.

A.1) Discretization
Cut-offs applied to continuous variables.
Optimal placement of the cut-offs.

A simple two-dimensional problem: the points of the two classes are represented by circles and crosses, respectively.

E. Ferrari and M. Muselli, "Maximizing pattern separation in discretizing continuous features for classification purposes," The 2010 International Joint Conference on Neural Networks (IJCNN), pp. 1-8, July 2010, doi: 10.1109/IJCNN.2010.5596838.
A) Discretization & Latticization
– Inverse only-one coding: [x1,…,xn] → [0,1]^n, [y1,…,yk] → [0,1]^k; the coded sample lies in [0,1]^(n+k)

Class  x1  Bin(x1)  x2  Bin(x2)  Final string
B       8  011      0   01       01101
A      12  101      1   10       10110
A      22  110      1   10       11010

Latticization: for continuous variables, such as x1, binary values correspond to cut-offs between adjacent values (e.g., 10 and 20).
A) Discretization & Latticization
– Inverse only-one coding: [x1,…,xn] → [0,1]^n, [y1,…,yk] → [0,1]^k; the coded sample lies in [0,1]^(n+k)

Class  x1  Bin(x1)  x2  Bin(x2)  Final string
B       8  011      0   01       01101
A      12  101      1   10       10110
A      22  110      1   10       11010

For continuous variables, such as x1, binary values correspond to cut-offs between adjacent values (e.g., 10 and 20).

B) Shadow Clustering (SC)
– Implicant identification
– Boolean rule extraction

[Lattice of binary strings: 111 at the top; 011 (B), 101 (A), 110 (A); 001, 010, 100; 000 at the bottom. The highlighted implicant, based on x1, identifies class A.]

No tuning of any parameter in SC!
A) Discretization & Latticization
– Inverse only-one coding: [x1,…,xn] → [0,1]^n, [y1,…,yk] → [0,1]^k; the coded sample lies in [0,1]^(n+k)

Class  x1  Bin(x1)  x2  Bin(x2)  Final string
B       8  011      0   01       01101
A      12  101      1   10       10110
A      22  110      1   10       11010

For continuous variables, such as x1, binary values correspond to cut-offs between adjacent values (e.g., 10 and 20).

B) Shadow Clustering (SC)
– Implicant identification
– Boolean rule extraction

[Lattice of binary strings: 111 at the top; 011 (B), 101 (A), 110 (A); 001, 010, 100; 000 at the bottom. The highlighted implicant, based on x1, identifies class A.]

No tuning of any parameter in SC!

C) Conversion into Intelligible Rules
Rule template: If X > val_x and Y ≤ val_y then Class = A

Original values  Binary string  Condition on variable x1
8 (B)            011
12 (A)           101            If x1 > 10 then Class = A
22 (A)           110

Cut-offs identified during latticization and suitable for classification purposes according to SC are recovered and used to obtain the conditions inside the rules.
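A small sketch of the inverse only-one coding for a continuous variable; the asserts reproduce the x1 column of the table above with cut-offs at 10 and 20, while the helper name is illustrative:

```python
import numpy as np

def inverse_only_one(x: float, cutoffs) -> str:
    """Encode x by the interval it falls in: all bits set except that one."""
    k = int(np.searchsorted(cutoffs, x))     # interval index
    bits = ["1"] * (len(cutoffs) + 1)
    bits[k] = "0"
    return "".join(bits)

# Matches the slide's example with cut-offs at 10 and 20:
assert inverse_only_one(8,  [10, 20]) == "011"
assert inverse_only_one(12, [10, 20]) == "101"
assert inverse_only_one(22, [10, 20]) == "110"
```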
Cybersecurity of vehicle platooning

85
Packet falsification
Packet falsification consists in manipulating the acceleration field of IEEE 802.11p packets, i.e., sending unreal indications to the follower vehicle (whenever the vehicle decelerates, the malicious packet reports an acceleration, and vice versa).

S. Ucar, S. C. Ergen, and O. Ozkasap, "Security vulnerabilities of IEEE 802.11p and visible light communication based platoon," in 2016 IEEE Vehicular Networking Conference (VNC), Dec 2016, pp. 1-4.
86
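A toy sketch of the attack model described above, using the t′ = 2 s and D = 3 s values of the next slide; the function and parameter names are illustrative assumptions:

```python
def falsify(acc: float, t: float, t_attack: float = 2.0, D: float = 3.0) -> float:
    """Flip the sign of the acceleration field while the attack is active,
    so a deceleration is reported as an acceleration and vice versa."""
    if t_attack <= t < t_attack + D:
        return -acc
    return acc
```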
Packet falsification
Packet falsification consists in manipulating the acceleration field of IEEE 802.11p packets, i.e., sending unreal indications to the follower vehicle (whenever the vehicle decelerates, the malicious packet reports an acceleration, and vice versa).

Attack at t′ = 2 s. Duration of the attack D = 3 s.
87


Approaching the problem
We approach the problem through the above methodology.

88
Intuition
Let’s have a look at integrals of differences of speeds and distances

89
New features
In formulas…
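The formulas themselves are on the slide image and are not reproduced here; the sketch below is only a plausible reading of "integrals of differences of speeds and distances," assuming leader/follower signals uniformly sampled at step dt:

```python
import numpy as np

def integral_features(v_lead, v_foll, d, d_ref, dt=0.1):
    """Cumulative integrals (via cumulative sums) of the speed difference
    and of the deviation of distance from a reference value."""
    dv = np.asarray(v_lead) - np.asarray(v_foll)
    dd = np.asarray(d) - d_ref
    I_v = np.cumsum(dv) * dt      # integral of speed differences over time
    I_d = np.cumsum(dd) * dt      # integral of distance deviations over time
    return I_v, I_d
```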

90
Temporal dynamics into ML
Does machine learning (ML) help us in synthesizing temporal dynamics into detection?

91
Temporal dynamics into ML
Does machine learning (ML) help us in synthesizing temporal dynamics into detection?

92
Temporal dynamics into ML
Does machine learning (ML) help us in synthesizing temporal dynamics into detection?

93
Temporal dynamics into ML: results
Does machine learning (ML) help us in synthesizing temporal dynamics into detection?

94
Temporal dynamics into ML: results
Does machine learning (ML) help us in synthesizing temporal dynamics into detection?

95
Temporal dynamics into ML: results
Does machine learning (ML) help us in synthesizing temporal dynamics into detection?

Feature ranking:

Complex rules on integrals, still preserving reliable prediction, if Isys is not used.

96
Fres: countermeasure after detection

K=2, Fres=-500.
97
Need minimum |Fres| while FNR=0
Objective: safety regions with FNR=0%.

Fres optimal thresholds are found for different F0 intervals => F0 should be known to
calibrate the response to the attack:

98
Fres complex configuration
Objective: safety regions with FNR=0%.

Fres optimal thresholds are found for different F0 intervals => F0 should be known to
calibrate the response to the attack:

99
Fres complex configuration
Objective: safety regions with FNR=0%.

Fres optimal thresholds are found for different F0 intervals => F0 should be known to
calibrate the response to the attack:

The worst case is actually impractical, as it leads to platoons working at low speed and large distances. This is, however, not surprising, as such a platoon is able to resist the attack under extreme braking conditions.
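A hypothetical sketch of the calibration implied here: the Fres threshold is looked up from the interval that F0 falls in. All interval bounds and threshold values below are invented placeholders, not values from the slides:

```python
import bisect

F0_EDGES = [-8000.0, -6000.0, -4000.0]               # assumed F0 interval bounds
FRES_THRESHOLDS = [-500.0, -400.0, -300.0, -200.0]   # assumed optimal Fres per interval

def fres_for(f0: float) -> float:
    """Pick the response threshold from the F0 interval (F0 must be known)."""
    return FRES_THRESHOLDS[bisect.bisect(F0_EDGES, f0)]
```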

100
THANK YOU… Q&A

All Rights Reserved © Rulex, Inc. 2015


Logic Learning Machines vs Decision Trees

Logic Learning Machine:
• Training is fast and parallelizable
• Model rules are independent from each other
• Relevance measures for variables and values are automatically generated
• Accuracy is generally higher
• Specificity and sensitivity can be controlled
• Models are usually less complex, with simpler rules

Decision Trees:
• Training is fast but not efficiently parallelizable
• Model rules are disjoint, but conditions are strictly dependent
• Relevance measures for variables and values are not directly available
• Accuracy is generally poorer
• Specificity and sensitivity cannot be controlled
• Models are usually more complex, with longer rules
