Adi Shamir - A Simple Explanation for the Mysterious Existence of Adversarial Examples with Small Hamming Distance


The existence of adversarial examples, in which tiny changes in the input can fool well-trained neural networks, has many applications and implications in object recognition, autonomous driving, cybersecurity, etc. However, it is still far from understood why such examples exist, and which parameters determine the number of input coordinates one has to change in order to mislead the network. In this talk I will describe a simple mathematical framework which enables us to think about this problem from a fresh perspective, turning the existence of adversarial examples from a baffling phenomenon into a natural consequence of the geometry of $\mathbb{R}^n$ with the $L_0$ (Hamming) metric, which can be quantitatively analyzed. An important benefit of our analysis is that it enables us to show that many proposed immunization techniques against adversarial attacks are unlikely to succeed, and to propose other techniques based on novel geometric principles.

Seminar Talk
33 Oxford St, MD Room 119, Cambridge, Massachusetts