
A Perturbation Analysis of Input Transformations for Adversarial Attacks

Summary

The existence of adversarial examples, inputs constructed by making small changes to correctly predicted examples so that the model mis-predicts them, is one of the most significant challenges in neural network research today. Ironically, many new defenses are based on a simple observation: the adversarial inputs themselves are not robust, and small perturbations to the attacking input often recover the desired prediction. While the intuition is somewhat clear, a detailed understanding of this phenomenon is missing from the research literature. This paper presents a comprehensive experimental analysis of when and why perturbation defenses work, along with potential mechanisms that could explain their effectiveness (or ineffectiveness) in different settings.
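The mechanism the summary describes, perturbing a (possibly adversarial) input before classification in the hope of recovering the correct label, can be sketched roughly as below. This is a minimal illustration, not the paper's actual defense or settings: the PyTorch classifier `model`, the noise scale `sigma`, and the vote count `n_samples` are assumed placeholders chosen only for the example.

```python
import torch

def randomized_prediction(model, x, sigma=0.03, n_samples=16):
    """Classify input `x` after adding small random perturbations.

    If an adversarial example is itself brittle, aggregating predictions
    over several lightly perturbed copies of the input may recover the
    original (correct) label. `sigma` and `n_samples` are illustrative
    values, not settings taken from the paper.
    """
    model.eval()
    with torch.no_grad():
        # Draw n_samples noisy copies of the input and stack them into a batch.
        noise = sigma * torch.randn(n_samples, *x.shape)
        noisy_batch = (x.unsqueeze(0) + noise).clamp(0.0, 1.0)
        logits = model(noisy_batch)
        # Majority vote over the per-copy predicted classes.
        votes = logits.argmax(dim=1)
        return votes.mode().values.item()
```

One could equally substitute other input transformations (random resizing, cropping, JPEG compression) for the Gaussian noise; the paper's analysis concerns when such perturbation-based defenses succeed or fail, not any one specific transformation.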

Conference Paper

International Conference on Learning Representations (ICLR)

Date published

2019

Date last modified

2024-10-14