Generalization Bounds for Heavy-Tailed Losses
Gholamali Aminian
Abstract
The generalization error of a supervised statistical learning algorithm, defined as the difference between the population risk and the empirical risk, measures how well the algorithm performs on previously unseen data. In this work, we analyze the generalization error under a heavy-tailed assumption on the loss function with respect to the data-generating distribution. Specifically, we derive uniform, information-theoretic, and PAC-Bayesian bounds on the generalization error under the assumption that the $(1+\epsilon)$-th moment of the loss function is bounded for some $\epsilon\in(0,1]$. The generalization error is shown to converge at rate $O(n^{-\epsilon/(1+\epsilon)})$, where $n$ is the number of training samples. Furthermore, we apply our results to study the generalization error of the Gibbs posterior and of noisy iterative learning algorithms under the heavy-tailed assumption.
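For concreteness, the quantities summarized above can be sketched as follows; the notation (training sample $S=(Z_1,\dots,Z_n)$ drawn from a distribution $\mu$, hypothesis $W$, loss $\ell(w,z)$, and moment bound $M$) is assumed here for illustration and may differ from the notation used in the body of the paper.
\[
  \underbrace{\mathrm{gen}(W,S)}_{\text{generalization error}}
  \;=\;
  \underbrace{\mathbb{E}_{Z\sim\mu}\big[\ell(W,Z)\big]}_{\text{population risk}}
  \;-\;
  \underbrace{\frac{1}{n}\sum_{i=1}^{n}\ell(W,Z_i)}_{\text{empirical risk}},
\]
where the heavy-tailed assumption requires only that, for some $\epsilon\in(0,1]$ and a constant $M<\infty$,
\[
  \mathbb{E}_{Z\sim\mu}\big[\,|\ell(w,Z)|^{1+\epsilon}\,\big] \;\le\; M \quad \text{for all hypotheses } w,
\]
rather than boundedness or sub-Gaussianity of the loss; under this condition the bounds scale as $O\big(n^{-\epsilon/(1+\epsilon)}\big)$.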