


Oral Session 7: Robust Learning



A Robust Kernel Statistical Test of Invariance: Detecting Subtle Asymmetries

Ashkan Soleymani · Behrooz Tahmasebi · Stefanie Jegelka · Patrick Jaillet

While invariances naturally arise in almost any type of real-world data, no efficient and robust test exists for detecting them in observational data under arbitrarily given group actions. We tackle this problem by studying measures of invariance that can capture even negligible underlying patterns. Our first contribution is to show that, while detecting subtle asymmetries is computationally intractable, a randomized method can be used to robustly estimate closeness measures to invariance within constant factors. This provides a general framework for robust statistical tests of invariance. Despite the extensive and well-established literature, our methodology, to the best of our knowledge, is the first to provide statistical tests for general group invariances with finite-sample guarantees on Type II errors. In addition, we focus on kernel methods and propose deterministic algorithms for robust testing with respect to both finite and infinite groups, accompanied by a rigorous analysis of their convergence rates and sample complexity. Finally, we revisit the general framework in the specific case of kernel methods, showing that recent closeness measures to invariance, defined via group averaging, are provably robust, leading to powerful randomized algorithms.
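The closeness-to-invariance measures defined via group averaging admit a simple kernel-based estimator. The sketch below is an illustration only, not the paper's exact construction: for a finite group it compares the empirical sample with its group-averaged counterpart through a biased MMD estimate. The RBF kernel, bandwidth, and sign-flip group are assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(X, Y, bandwidth=1.0):
    """Gaussian RBF kernel matrix between the rows of X and Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * bandwidth**2))

def mmd2_to_symmetrized(X, group_actions, bandwidth=1.0):
    """Biased MMD^2 estimate between the empirical distribution of X and its
    group-averaged (symmetrized) counterpart, used here as a proxy for a
    closeness-to-invariance measure under a finite group."""
    GX = np.vstack([g(X) for g in group_actions])  # pool of g(x) for every group element g
    k_xx = rbf_kernel(X, X, bandwidth).mean()
    k_xg = rbf_kernel(X, GX, bandwidth).mean()
    k_gg = rbf_kernel(GX, GX, bandwidth).mean()
    return k_xx - 2 * k_xg + k_gg

# Example: measure invariance of 2D data under sign flips of the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) + np.array([0.3, 0.0])        # subtly asymmetric data
group = [lambda Z: Z, lambda Z: Z * np.array([-1.0, 1.0])]  # {identity, flip}
print(mmd2_to_symmetrized(X, group))  # larger values = farther from invariance
```

A randomization scheme over group elements (or a permutation scheme) would supply the null distribution needed to turn this statistic into a calibrated test; the snippet only computes the raw measure.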


Certifiably Quantisation-Robust Training and Inference of Neural Networks

Hue Dang · Matthew Wicker · Goetz Botterweck · Andrea Patane

We tackle the problem of computing guarantees for the robustness of neural networks against quantisation of their inputs, parameters and activation values. In particular, we pose the problem of bounding the worst-case discrepancy between the original neural network and all possible quantised ones parametrised by a given maximum quantisation diameter $\epsilon > 0$ over a finite dataset. To achieve this, we first reformulate the problem in terms of bilinear optimisation, which can be solved for provable bounds on the robustness guarantee. We then show how a fast scheme based on interval bound propagation can be developed and implemented during training so as to allow for the learning of neural networks robust against a continuous family of quantisation techniques. We evaluate our methodology on a variety of architectures across datasets such as MNIST, F-MNIST and CIFAR10. We demonstrate how non-trivial bounds on guaranteed accuracy can be obtained on several architectures and how quantisation robustness can be significantly improved through robust training.
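Interval bound propagation under quantisation can be illustrated with a toy sketch. The code below is an assumption-laden illustration, not the paper's training scheme: it propagates centre/radius intervals through a linear layer whose weight and bias entries may each be perturbed by at most the quantisation diameter eps, followed by an exact ReLU step. It covers parameter quantisation only, whereas the paper also handles input and activation quantisation and uses the bounds during training.

```python
import numpy as np

def ibp_linear(x_c, x_r, W, b, eps):
    """Interval propagation through y = Wx + b when every weight and bias entry
    may be quantised anywhere within +/- eps of its float value, and the input
    itself is an interval [x_c - x_r, x_c + x_r] (sound outer approximation)."""
    y_c = W @ x_c + b
    y_r = np.abs(W) @ x_r + eps * (np.abs(x_c) + x_r).sum() + eps
    return y_c, y_r

def ibp_relu(x_c, x_r):
    """Exact interval propagation through an elementwise ReLU."""
    lo, hi = np.maximum(x_c - x_r, 0.0), np.maximum(x_c + x_r, 0.0)
    return (lo + hi) / 2.0, (hi - lo) / 2.0

# Toy two-layer network; the weights, shapes and eps value are illustrative.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)
x = rng.normal(size=4)

c, r = ibp_linear(x, np.zeros(4), W1, b1, eps=0.01)
c, r = ibp_relu(c, r)
c, r = ibp_linear(c, r, W2, b2, eps=0.01)
print("output bounds:", c - r, c + r)  # encloses every 0.01-quantised network
```

Adding the resulting interval width as a penalty during training is the standard way such bounds are used to encourage quantisation robustness; the tighter bilinear-optimisation bounds from the abstract are not reproduced here.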


Learning from Biased Positive-Unlabeled Data via Threshold Calibration

Paweł Teisseyre · Timo Martens · Jessa Bekker · Jesse Davis

Learning from positive and unlabeled data (PU learning) aims to train a binary classification model when only positive and unlabeled examples are available. Typically, learners assume that there is a labeling mechanism that determines which positive labels are observed. A particularly challenging setting arises when the observed positive labels are a biased sample from the positive distribution. Current approaches either require estimating the propensity scores, which are the instance-specific probabilities that a positive example's label will be observed, or make overly restrictive assumptions about the labeling mechanism. We make a novel assumption about the labeling mechanism which we show to be more general than several commonly used existing ones. Moreover, the combination of our novel assumption and theoretical results from robust statistics can simplify the process of learning from biased PU data. Empirically, our approach offers superior predictive and runtime performance compared to state-of-the-art methods.
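For context, threshold calibration is easiest to see under the classical selected-completely-at-random (SCAR) assumption of Elkan and Noto, one of the restrictive assumptions this paper generalises. The sketch below illustrates only that classical calibration, not the paper's method: the estimated label frequency c rescales the scores of a "non-traditional" classifier so that thresholding the calibrated posterior at 0.5 is equivalent to thresholding the raw score at c/2. The synthetic data and logistic model are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def scar_threshold_calibration(X, s):
    """Classic SCAR-style calibration: fit a classifier g(x) that predicts the
    *label* indicator s, estimate the label frequency c = E[g(x) | s = 1], and
    rescale so that the calibrated posterior approximates P(y = 1 | x)."""
    clf = LogisticRegression(max_iter=1000).fit(X, s)
    g = clf.predict_proba(X)[:, 1]
    c = g[s == 1].mean()                   # estimated label frequency
    posterior = np.clip(g / c, 0.0, 1.0)   # calibrated P(y = 1 | x)
    return clf, c, posterior

# Synthetic PU data: positives are labeled with probability 0.3 (SCAR setting).
rng = np.random.default_rng(2)
y = rng.integers(0, 2, size=2000)
X = rng.normal(size=(2000, 2)) + y[:, None] * 1.5
s = ((y == 1) & (rng.random(2000) < 0.3)).astype(int)
clf, c, post = scar_threshold_calibration(X, s)
y_hat = post >= 0.5                        # equivalent to g(x) >= c / 2
print("estimated label frequency:", round(c, 3))
```

Under labeling bias the single constant c no longer suffices, which is exactly the regime the abstract addresses with a more general assumption and tools from robust statistics.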

We propose a general method for constructing robust permutation tests under data corruption. The proposed tests effectively control the non-asymptotic type I error under data corruption, and we prove their consistency in power under minimal conditions. This contributes to the practical deployment of hypothesis tests for real-world applications with potential adversarial attacks. For the two-sample and independence settings, we show that our kernel robust tests are minimax optimal, in the sense that they are guaranteed to be non-asymptotically powerful against alternatives uniformly separated from the null in the kernel MMD and HSIC metrics at some optimal rate (tight with matching lower bound). We point out that existing differentially private tests can be adapted to be robust to data corruption, and we demonstrate in experiments that our proposed tests achieve much higher power than these private tests. Finally, we provide publicly available implementations and empirically illustrate the practicality of our robust tests.
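As a point of reference, the uncorrupted baseline is the standard kernel permutation test. The sketch below implements a plain quadratic-time MMD two-sample permutation test and does not reproduce the corruption-robust adjustment that yields the non-asymptotic Type I guarantee described above; the kernel choice, bandwidth, and permutation count are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, bandwidth=1.0):
    """Gaussian RBF kernel matrix between the rows of X and Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * bandwidth**2))

def mmd2_biased(Z, n, bandwidth=1.0):
    """Biased quadratic-time MMD^2 between Z[:n] and Z[n:]."""
    K = rbf_kernel(Z, Z, bandwidth)
    return K[:n, :n].mean() - 2 * K[:n, n:].mean() + K[n:, n:].mean()

def mmd_permutation_test(X, Y, num_perms=200, alpha=0.05, bandwidth=1.0, rng=None):
    """Standard two-sample MMD permutation test: reject when the observed
    statistic exceeds the (1 - alpha) quantile of its permutation distribution."""
    rng = rng or np.random.default_rng()
    Z, n = np.vstack([X, Y]), len(X)
    observed = mmd2_biased(Z, n, bandwidth)
    null = np.array([mmd2_biased(Z[rng.permutation(len(Z))], n, bandwidth)
                     for _ in range(num_perms)])
    p_value = (1 + np.sum(null >= observed)) / (1 + num_perms)
    return observed, p_value, p_value <= alpha

# Example: two slightly shifted Gaussian samples.
rng = np.random.default_rng(3)
X, Y = rng.normal(size=(100, 2)), rng.normal(size=(100, 2)) + 0.5
print(mmd_permutation_test(X, Y, rng=rng))
```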

We explore the control of stochastic systems with potentially continuous state and action spaces, characterized by the state dynamics $X_{t+1} = f(X_t, A_t, W_t)$. Here, $X$, $A$, and $W$ represent the state, action, and exogenous random noise processes, respectively, with $f$ denoting a known function that describes state transitions. Traditionally, the noise process $\{W_t, t \geq 0\}$ is assumed to be independent and identically distributed, with a distribution that is either fully known or can be consistently estimated. However, the occurrence of distributional shifts, typical in engineering settings, necessitates the consideration of the robustness of the policy. This paper introduces a distributionally robust stochastic control paradigm that accommodates possibly adaptive adversarial perturbations to the noise distribution within a prescribed ambiguity set. We examine two adversary models: current-action-aware and current-action-unaware, leading to different dynamic programming equations. Furthermore, we characterize the optimal finite-sample minimax rates for achieving uniform learning of the robust value function across a continuum of states under both adversary types, considering ambiguity sets defined by $f_k$-divergence and Wasserstein distance. Finally, we demonstrate the applicability of our framework across various real-world settings.
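On a finite discretization, the current-action-aware dynamic programming equation can be sketched as a robust value iteration in which each Bellman backup takes the worst case over an ambiguity set of noise distributions after the action is fixed. The code below is an illustrative toy (finite states, actions, and noise support, with an explicitly enumerated ambiguity set), not the paper's continuum-state formulation or its minimax-rate analysis.

```python
import numpy as np

def robust_value_iteration(f, r, noise_support, ambiguity_set, states, actions,
                           gamma=0.9, iters=200):
    """Robust value iteration for the current-action-aware adversary:
        V(x) = max_a min_{Q in ambiguity} E_{W ~ Q}[ r(x, a, W) + gamma * V(f(x, a, W)) ],
    where the ambiguity set is given as an explicit list of noise distributions
    over the finite noise_support."""
    V = {x: 0.0 for x in states}
    for _ in range(iters):
        V_new = {}
        for x in states:
            best = -np.inf
            for a in actions:
                returns = np.array([r(x, a, w) + gamma * V[f(x, a, w)]
                                    for w in noise_support])
                worst = min(q @ returns for q in ambiguity_set)  # adversary sees a
                best = max(best, worst)
            V_new[x] = best
        V = V_new
    return V

# Tiny illustrative example: two states, noise may flip the intended transition.
states, actions, noise_support = [0, 1], [0, 1], [0, 1]
f = lambda x, a, w: a if w == 0 else 1 - a            # noise flips the move when w = 1
r = lambda x, a, w: 1.0 if f(x, a, w) == 1 else 0.0   # reward for landing in state 1
ambiguity_set = [np.array([0.9, 0.1]), np.array([0.7, 0.3])]  # candidate noise laws
print(robust_value_iteration(f, r, noise_support, ambiguity_set, states, actions))
```

The current-action-unaware adversary would commit to a distribution before observing the action, which changes the order of the max and min and hence the dynamic programming equation, as noted in the abstract.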