AISTATS Poster Generalization Bounds for Label Noise Stochastic Gradient Descent

Jung Eun Huh · Patrick Rebeschini

MR1 & MR2 - Number 103

Abstract: We develop generalization error bounds for stochastic gradient descent (SGD) with label noise in non-convex settings under uniform dissipativity and smoothness conditions. Under a suitable choice of semimetric, we establish a contraction in Wasserstein distance of the label noise stochastic gradient flow that depends polynomially on the parameter dimension $d$. Using the framework of algorithmic stability, we derive time-independent generalization error bounds for the discretized algorithm with a constant learning rate. The error bound we achieve scales polynomially with $d$ and with the rate of $n^{-2/3}$, where $n$ is the sample size. This rate is better than the best-known rate of $n^{-1/2}$ established for stochastic gradient Langevin dynamics (SGLD), which employs parameter-independent Gaussian noise, under similar conditions. Our analysis offers quantitative insights into the effect of label noise.
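
To make the contrast between the two noise mechanisms concrete, here is a minimal sketch and not the authors' implementation. It assumes a toy non-convex model $\tanh(x^\top w)$ with squared loss, full-batch gradients, and Gaussian label perturbations. The function names and the parameters `sigma` (label noise scale) and `beta` (inverse temperature) are hypothetical choices made for illustration.

```python
import numpy as np

def grad_squared_loss(w, X, y):
    """Gradient in w of the mean of 0.5 * (tanh(X @ w) - y)**2 (toy model, assumed)."""
    pred = np.tanh(X @ w)
    # Chain rule: d/dw tanh(x.w) = (1 - tanh(x.w)**2) * x
    return X.T @ ((pred - y) * (1.0 - pred**2)) / len(y)

def label_noise_sgd_step(w, X, y, lr, sigma, rng):
    """One constant-learning-rate step with freshly perturbed labels."""
    # Noise enters through the labels, so its effect on w is scaled by
    # (1 - pred**2) * x and therefore depends on the current parameters.
    y_noisy = y + sigma * rng.standard_normal(y.shape)
    return w - lr * grad_squared_loss(w, X, y_noisy)

def sgld_step(w, X, y, lr, beta, rng):
    """One SGLD step: isotropic Gaussian noise that does not depend on w."""
    noise = np.sqrt(2.0 * lr / beta) * rng.standard_normal(w.shape)
    return w - lr * grad_squared_loss(w, X, y) + noise
```

In this toy setting the label perturbation reaches the parameters only through the model's Jacobian, which is one way to read "parameter-dependent" noise, whereas the SGLD perturbation has the same distribution at every $w$. The sketch says nothing about the rates in the abstract; it only illustrates the two update rules under the stated assumptions.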
