
Building a Machine Learning System Resilient to Adversarial Attacks
Machine learning offers tremendous promise but is vulnerable to adversarial attacks. In this
presentation, we will explore how to create a resilient machine learning system.

Project Guide: Dr. Manisha Parlewar

Group Members

Arun Raj K V (B180590EC)

Akthar Azif (B180739EC)

Shamil Shihab (B190775EC)

Rithul Sabi Kumar (B191090EC)


Overview of Adversarial Attacks
1 Definition

Adversarial attacks modify data inputs to produce incorrect results from machine learning models.

2 Examples

Examples of adversarial attack techniques include FGSM, iFGSM, and MI-FGSM.

3 Impact

Adversarial attacks can result in catastrophic consequences, from fraudulent financial transactions to fatal accidents.
Understanding FGSM

The Fast Gradient Sign Method (FGSM) is a popular method for generating adversarial
examples in machine learning. It works by adding small perturbations to input data based on
the gradient of the loss function. These perturbations can cause a machine learning model to
misclassify the input, even if it appears unchanged to a human observer. Understanding FGSM
is critical to developing defenses against adversarial attacks.
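As a concrete illustration, the single-step FGSM update can be sketched in a few lines of PyTorch. This is a minimal sketch, assuming a classifier trained with cross-entropy loss; the model, inputs, and epsilon value are illustrative placeholders rather than the setup used in this project.

import torch
import torch.nn as nn

def fgsm_attack(model, x, y, epsilon=0.03):
    """Craft x_adv = x + epsilon * sign(grad_x loss), the single-step FGSM perturbation."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x_adv), y)
    loss.backward()
    # Step in the direction of the sign of the input gradient, then keep pixels in [0, 1].
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0, 1).detach()

Here epsilon controls the size of the perturbation; small values keep the adversarial input visually indistinguishable from the original while still flipping the prediction.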
Understanding iFGSM

The Iterative Fast Gradient Sign Method (iFGSM) is an extension of FGSM that generates
multiple perturbations to input data. It is more effective than FGSM and can produce stronger
adversarial examples. Understanding iFGSM is important for developing robust defenses
against adversarial attacks.
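A minimal iFGSM sketch follows, assuming the same PyTorch setup as the FGSM example above; the step size alpha and the number of iterations are illustrative choices.

import torch
import torch.nn as nn

def ifgsm_attack(model, x, y, epsilon=0.03, alpha=0.007, steps=10):
    """Apply small FGSM steps repeatedly, projecting back into the epsilon-ball around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.CrossEntropyLoss()(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            # Keep the perturbation within epsilon of the original input and in [0, 1].
            x_adv = torch.clamp(x_adv, x - epsilon, x + epsilon).clamp(0, 1)
    return x_adv

Because each step re-computes the gradient at the current adversarial point, iFGSM usually finds stronger perturbations than a single FGSM step of the same total size.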
Understanding MI-FGSM

The Momentum Iterative Fast Gradient Sign Method (MI-FGSM) is a state-of-the-art method for
generating adversarial examples in machine learning. It uses a momentum term to
accumulate gradients across iterations, which can lead to even stronger attacks.
Understanding MI-FGSM is crucial for developing effective defenses against adversarial
attacks.
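A minimal MI-FGSM sketch extends the iterative loop above with a momentum accumulator; the decay factor mu and the other hyperparameters are illustrative assumptions.

import torch
import torch.nn as nn

def mifgsm_attack(model, x, y, epsilon=0.03, alpha=0.007, steps=10, mu=1.0):
    """Accumulate normalised gradients with momentum and step by the sign of the accumulator."""
    x_adv = x.clone().detach()
    g = torch.zeros_like(x)  # momentum accumulator
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.CrossEntropyLoss()(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            grad = x_adv.grad
            # Normalise by the mean absolute value so the momentum term is scale-invariant.
            g = mu * g + grad / grad.abs().mean()
            x_adv = x_adv + alpha * g.sign()
            x_adv = torch.clamp(x_adv, x - epsilon, x + epsilon).clamp(0, 1)
    return x_adv

The momentum term smooths the update direction across iterations, which helps the attack avoid poor local maxima and tends to make the adversarial examples transfer better to other models.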
Approaches to Designing a Resilient Machine Learning System

Defensive Distillation

It involves training the model on a distilled version of the original training data, which is generated by training another model on the original data. This second model is trained to predict the output of the first model, rather than the original labels. The distilled data is then used to train the target model.

Adversarial Training

Adversarial training modifies machine learning algorithms to teach them how to recognize and counteract adversarial attacks. It involves adding modified inputs (adversarial examples) to the training data so that the model learns to recognize and defend against such attacks (a minimal sketch of one such training step is given after this list).

Gradient Masking

Gradient masking is a technique that obscures gradients, making it more challenging for attackers to manipulate them to cause damage.

Ensemble Techniques

Ensemble techniques use the output of multiple machine learning systems to improve the overall result, making it more challenging for attackers to find weaknesses.
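The adversarial-training idea above can be sketched as a single training step that mixes clean and FGSM-perturbed inputs, reusing the illustrative fgsm_attack helper from the FGSM section; the 50/50 clean-to-adversarial mix is an assumption, not the project's actual recipe.

import torch
import torch.nn as nn

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One optimisation step on a batch augmented with FGSM adversarial examples."""
    x_adv = fgsm_attack(model, x, y, epsilon)   # craft adversarial versions of the batch
    inputs = torch.cat([x, x_adv])              # mix clean and adversarial inputs
    targets = torch.cat([y, y])                 # labels stay the same for both halves
    optimizer.zero_grad()
    loss = nn.CrossEntropyLoss()(model(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()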
A failed defense: “gradient masking”
1. Perturb the input during the attack.

2. The system misidentifies the class.

3. What if there were no gradient?
Why Defensive Distillation
1 Adversarial Training (drawbacks)

Increased computational and training costs

Limited generalization

Trade-off between accuracy and robustness

2 Ensemble Techniques (drawbacks)
Increased computational and training costs

Overfitting

Difficulty in interpretation

Limited generalization

3 Defensive Distillation (advantages)
Improved robustness

Lower computational and training costs

Simpler implementation

Reduced overfitting
Training Data and Defensive Distillation
1. Train a "teacher" model on the original training data. This model is typically a large,
complex model that is trained to perform well on the training data.

2. Use the teacher model to generate a new set of training data by predicting the outputs of
the original training data. This new data set is referred to as the "distilled" data.

3. Train a "student" model on the distilled data. This model is typically a smaller, simpler
model that is easier and faster to train than the teacher model.

4. Use the student model for inference, i.e., to make predictions on new data.
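The four steps above can be sketched as a short PyTorch training loop. This is a minimal sketch assuming teacher and student classifiers and a data loader; the temperature T, optimiser, and hyperparameters are illustrative placeholders rather than the values used in this project.

import torch
import torch.nn.functional as F

def distill(teacher, student, loader, epochs=10, T=20.0, lr=1e-3):
    """Train the student on the teacher's temperature-softened outputs (steps 2-3 above)."""
    optimizer = torch.optim.Adam(student.parameters(), lr=lr)
    teacher.eval()
    for _ in range(epochs):
        for x, _ in loader:                                       # original labels are not used
            with torch.no_grad():
                soft_targets = F.softmax(teacher(x) / T, dim=1)   # step 2: distilled labels
            log_probs = F.log_softmax(student(x) / T, dim=1)
            loss = F.kl_div(log_probs, soft_targets, reduction="batchmean")   # step 3
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student                                                # step 4: use for inference

Training on the teacher's temperature-softened outputs, then deploying the student at temperature 1, smooths the model's decision surface and shrinks the gradients that attacks such as FGSM rely on.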
RESULTS
Conclusion and Future Directions
Challenges

1. Robustness is difficult to achieve when algorithmic decisions depend on features of adversarial distributions.

2. Designing defenses can require privacy-preserving tests that increase complexity in AI systems.

3. Introducing new risks, such as hidden algorithms exploiting the shortcomings of the defenses, will threaten to neutralize their effectiveness.

Future

1. Future research must address adversarial attacks in the medical, financial, legal, and other domains.

2. Developing novel information-theoretic defences against adversarial machine learning attacks is an important area for research.

3. Developing application-specific defenses, making security a constraint during AI application and model design, is the need of the hour.
Conclusion
Adversarial attacks pose a serious threat to machine learning systems,
with potentially catastrophic consequences in a variety of applications.
Building resilient machine learning systems that are able to withstand
these attacks is therefore of utmost importance. Defensive distillation is
a promising approach to achieving this goal, offering several
advantages over other methods such as adversarial training and
ensemble techniques. However, there is still much work to be done in
this area, and developing effective defenses against adversarial attacks
remains an active area of research. By continuing to study these attacks
and developing robust and resilient machine learning systems, we can
help ensure the safety and security of the technologies that are
increasingly shaping our world.
References
Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z. B., & Swami, A. (2017). Practical black-box attacks against machine learning. arXiv preprint arXiv:1602.02697v4.

Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531v1. https://arxiv.org/abs/1503.02531v1

Papernot, N., McDaniel, P., Wu, X., Jha, S., & Swami, A. (2016). Distillation as a defense to adversarial perturbations against deep neural networks. 2016 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, pp. 582-597. doi: 10.1109/SP.2016.41.

Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z. B., & Swami, A. (2016). The limitations of deep learning in adversarial settings. Proceedings of the 1st IEEE European Symposium on Security and Privacy. IEEE.
Thank You
