Poster
Provable Adversarial Robustness for Fractional Lp Threat Models
Alexander Levine · Soheil Feizi
Abstract:
In recent years, researchers have extensively studied adversarial robustness in a variety of threat models, including $l_0$, $l_1$, $l_2$, and $l_{\infty}$-norm bounded adversarial attacks. However, attacks bounded by fractional $l_p$-“norms” (quasi-norms defined by the $l_p$ distance with $0 < p < 1$) have yet to be thoroughly considered. We proactively propose a defense with several desirable properties: it provides provable (certified) robustness, scales to ImageNet, and yields deterministic (rather than high-probability) certified guarantees when applied to quantized data (e.g., images). Our technique for fractional $l_p$ robustness constructs expressive, deep classifiers that are globally Lipschitz with respect to the $l_p^p$ metric, for any $0 < p < 1$. In fact, our method is more general: we can construct classifiers that are globally Lipschitz with respect to any metric defined as a sum of concave functions of the components. Our approach builds on recent work by Levine and Feizi (2021), which provides a provable defense against $l_1$ attacks; we demonstrate that our proposed guarantees are highly non-vacuous compared with the trivial baseline of applying Levine and Feizi (2021) directly and invoking norm inequalities.
Code is available at https://github.com/alevine0/fractionalLpRobustness.
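As a rough illustration of the certification logic sketched in the abstract (not the paper's actual construction): if every per-class score of a classifier is $L$-Lipschitz with respect to the metric $d_p(x, y) = \sum_i |x_i - y_i|^p$, then the standard Lipschitz argument certifies the prediction as long as $d_p$ stays below the score margin divided by $2L$. The sketch below, with a hypothetical Lipschitz constant and made-up scores, shows how such a certificate in the $l_p^p$ metric converts to a radius in the fractional $l_p$ quasi-norm.

```python
import numpy as np

def lpp_distance(x, y, p):
    """Fractional l_p^p "distance": sum_i |x_i - y_i|^p, a metric for 0 < p < 1."""
    return np.sum(np.abs(x - y) ** p)

def certified_radius_lpp(scores, lipschitz_const):
    """
    Generic Lipschitz certification (a sketch, not the paper's exact bound):
    if every class score is `lipschitz_const`-Lipschitz w.r.t.
    d_p(x, y) = sum_i |x_i - y_i|^p, the top prediction cannot change while
    d_p(x, x') < margin / (2 * lipschitz_const).
    """
    top_two = np.sort(scores)[-2:]
    margin = top_two[1] - top_two[0]
    return margin / (2.0 * lipschitz_const)

# Example with made-up numbers: three class scores, hypothetical Lipschitz constant L = 1.
scores = np.array([0.1, 0.7, 0.2])
radius_in_dp = certified_radius_lpp(scores, lipschitz_const=1.0)  # radius in the l_p^p metric

# Since d_p(x, x') = ||x - x'||_p^p, this converts to an l_p quasi-norm radius of radius_in_dp**(1/p).
p = 0.5
radius_in_lp = radius_in_dp ** (1.0 / p)
print(radius_in_dp, radius_in_lp)
```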