Generalization Bounds under Heavy-Tailed Losses
Gholamali Aminian
Abstract
The generalization error of a supervised statistical learning algorithm, defined as the difference between its population risk and its empirical risk, quantifies how well the algorithm performs on previously unseen data. In this work, we analyze the generalization error under a heavy-tailed assumption on the loss function with respect to the data-generating distribution. Specifically, we derive uniform, information-theoretic, and PAC-Bayesian bounds on the generalization error under the assumption that the $(1+\epsilon)$-th moment of the loss function is bounded for some $\epsilon\in(0,1]$. The resulting bounds show that the generalization error converges at rate $O(n^{-\epsilon/(1+\epsilon)})$, where $n$ is the number of training samples. Furthermore, we apply our results to study the generalization error of the Gibbs posterior and of noisy iterative learning algorithms under the heavy-tailed assumption.
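As a minimal sketch of the setting (the notation $\ell$, $w$, $Z$, $\mu$, and the moment bound $m$ is assumed here for illustration and is not fixed by the abstract), the heavy-tailed moment condition can be written as
$$\mathbb{E}_{Z\sim\mu}\!\left[\,|\ell(w,Z)|^{1+\epsilon}\,\right] \le m < \infty \quad \text{for some } \epsilon\in(0,1],$$
under which the derived bounds on the generalization error decay as $O(n^{-\epsilon/(1+\epsilon)})$. Note that at $\epsilon=1$, i.e., a bounded second moment, this rate recovers the familiar $O(n^{-1/2})$.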