Abstract:
The out-of-sample error (OO) is the main quantity of interest in risk estimation and model selection. Leave-one-out cross-validation (LO) offers a (nearly) distribution-free yet computationally demanding approach to estimating OO. Recent theoretical work showed that approximate leave-one-out cross-validation (ALO) is an efficient estimate of LO (and OO) for generalized linear models with differentiable regularizers. For problems involving non-differentiable regularizers, despite significant empirical evidence, a theoretical understanding of ALO's error is still lacking. In this paper, we present a novel theory for a wide class of problems in the generalized linear model family with non-differentiable regularizers. We bound the error |ALO − LO| in terms of intuitive metrics such as the size of leave-i-out perturbations in active sets, the sample size n, the number of features p, and the signal-to-noise ratio (SNR). As a consequence, for the elastic-net problem, we show that |ALO − LO| → 0 as p → ∞ while n/p and SNR remain bounded.