Adversarial Dictionary Learning
Submitted, 2022
Abstract
To bridge the gap between specific and universal attacks on deep classification networks, the present work frames the learning of multiple adversarial attacks as linear combinations of atoms from a dictionary of universal attacks. To learn such an adversarial dictionary, a non-convex proximal splitting framework, termed Adversarial Dictionary Learning (ADiL), is proposed. Numerical experiments show that an a posteriori study of the dictionary atoms reveals the most common patterns used to attack the classifier; these patterns can, in turn, be used to craft adversarial perturbations for new examples, achieving high transferability across different deep network architectures.
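To make the dictionary model concrete, below is a minimal sketch of the idea described in the abstract: each adversarial perturbation is formed as a linear combination of universal atoms, with example-specific coefficients. The names (D, v, n_atoms, eps) and the l_inf clipping are illustrative assumptions, not the paper's actual notation or implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 3 * 32 * 32      # flattened image dimension (e.g., a CIFAR-10-sized input)
n_atoms = 5          # number of universal attack atoms in the dictionary

# Dictionary of universal attack directions, one atom per column (assumed layout).
D = rng.standard_normal((d, n_atoms))

def craft_perturbation(v, eps=8 / 255):
    """Combine the dictionary atoms with coefficients v, then clip to an
    l_inf budget so the perturbation stays small (hypothetical constraint)."""
    delta = D @ v
    return np.clip(delta, -eps, eps)

# Example-specific coefficients select how much of each universal atom to use.
v = rng.standard_normal(n_atoms)
x = rng.uniform(0.0, 1.0, size=d)                      # a flattened clean input
x_adv = np.clip(x + craft_perturbation(v), 0.0, 1.0)   # perturbed input
```

In this framing, only the low-dimensional coefficient vector v changes from one example to the next, while the dictionary D is shared, which is what allows the learned atoms to be inspected and reused to attack new examples.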