Provable Adversarial Robustness for Fractional Lp Threat Models

Alexander Levine · Soheil Feizi

Wed 30 Mar 8:30 a.m. PDT — 10 a.m. PDT

Abstract: In recent years, researchers have extensively studied adversarial robustness in a variety of threat models, including $l_0, l_1, l_2$, and $l_{\infty}$-norm bounded adversarial attacks. However, attacks bounded by fractional $l_p$ “norms” (quasi-norms defined by the $l_p$ distance with $0 < p < 1$) have yet to be thoroughly considered. We proactively propose a defense with several desirable properties: it provides provable (certified) robustness, scales to ImageNet, and yields deterministic (rather than high-probability) certified guarantees when applied to quantized data (e.g., images). Our technique for fractional $l_p$ robustness constructs expressive, deep classifiers that are globally Lipschitz with respect to the $l_p^p$ metric, for any $0 < p < 1$. The method is in fact more general: we can construct classifiers that are globally Lipschitz with respect to any metric defined as the sum of concave functions of components. Our approach builds on a recent work, Levine and Feizi (2021), which provides a provable defense against $l_1$ attacks. We demonstrate that our proposed guarantees are highly non-vacuous compared to the trivial solution of using Levine and Feizi (2021) directly and applying norm inequalities.
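To make the quantities in the abstract concrete, the sketch below computes the $l_p^p$ distance $\sum_i |x_i - y_i|^p$, the corresponding fractional $l_p$ quasi-norm distance (its $1/p$-th power), and the $l_1$ distance, and then spot-checks two standard facts for $0 < p < 1$ on random inputs: the $l_p^p$ distance satisfies the triangle inequality, and the $l_1$ distance never exceeds the $l_p$ quasi-norm distance. The second fact is presumably the kind of norm inequality behind the "trivial solution": an $l_1$ certificate of radius $r$ also covers every perturbation with $l_p$ quasi-norm at most $r$. This is an illustration only, not the authors' code; all function names are hypothetical, and inputs are assumed to lie in $[0, 1]$ as for normalized image pixels.

```python
import random


def lpp_distance(x, y, p):
    """l_p^p distance: sum of |x_i - y_i|**p, for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    return sum(abs(a - b) ** p for a, b in zip(x, y))


def lp_quasinorm_distance(x, y, p):
    """Fractional l_p quasi-norm distance: (l_p^p distance) ** (1 / p)."""
    return lpp_distance(x, y, p) ** (1.0 / p)


def l1_distance(x, y):
    """Ordinary l_1 distance."""
    return sum(abs(a - b) for a, b in zip(x, y))


def check_properties(trials=1000, dim=8, p=0.5, seed=0, tol=1e-9):
    """Spot-check two inequalities on random vectors in [0, 1]^dim.

    1. Triangle inequality for the l_p^p distance.
    2. l_1 distance <= l_p quasi-norm distance.
    Returns True if both hold (up to floating-point tolerance) on every trial.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.random() for _ in range(dim)]
        y = [rng.random() for _ in range(dim)]
        z = [rng.random() for _ in range(dim)]
        if lpp_distance(x, z, p) > lpp_distance(x, y, p) + lpp_distance(y, z, p) + tol:
            return False
        if l1_distance(x, y) > lp_quasinorm_distance(x, y, p) + tol:
            return False
    return True


if __name__ == "__main__":
    x, y = [0.0, 0.0], [1.0, 1.0]
    print(lpp_distance(x, y, 0.5))           # l_p^p distance
    print(lp_quasinorm_distance(x, y, 0.5))  # l_p quasi-norm distance
    print(l1_distance(x, y))                 # l_1 distance
    print(check_properties())
```

In the two-pixel example, the $l_1$ distance equals 2 while the $l_{1/2}$ quasi-norm distance equals 4, so a budget stated in the quasi-norm can be much looser than the same number stated in $l_1$. A random spot-check is of course not a proof of either inequality; both follow from the concavity (and hence subadditivity) of $t \mapsto t^p$ on $t \ge 0$.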

Code is available at
