

Poster

Krylov--Bellman boosting: Super-linear policy evaluation in general state spaces

Eric Xia · Martin Wainwright

Auditorium 1 Foyer 51

Abstract:

We present and analyze the Krylov--Bellman Boosting algorithm for policy evaluation in general state spaces. It alternates between fitting the Bellman residual using non-parametric regression (as in boosting), and estimating the value function via the least-squares temporal difference (LSTD) procedure applied with a feature set that grows adaptively over time. By exploiting the connection to Krylov methods, we equip this method with two attractive guarantees. First, we provide a general convergence bound that allows for separate estimation errors in residual fitting and LSTD computation. Consistent with our numerical experiments, this bound shows that convergence rates depend on the restricted spectral structure, and are typically super-linear. Second, by combining this meta-result with sample-size-dependent guarantees for residual fitting and LSTD computation, we obtain concrete statistical guarantees that depend on the sample size along with the complexity of the function class used to fit the residuals. We illustrate the behavior of the KBB algorithm for various types of policy evaluation problems, and typically find large reductions in sample complexity relative to the standard approach of fitted value iteration.
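The alternation the abstract describes can be sketched in an idealized finite-state setting. This is not the authors' implementation: it assumes a known transition matrix `P` and reward vector `r`, and replaces the non-parametric regression step with exact residual fitting, so each round simply appends the current Bellman residual as a new feature and re-solves the projected fixed point by LSTD. The function name `kbb_policy_evaluation` and all variable names are illustrative.

```python
import numpy as np

def kbb_policy_evaluation(P, r, gamma, n_rounds):
    """Idealized finite-state sketch of Krylov--Bellman boosting.

    Assumes exact residual "fitting" (no regression error): each round
    adds the Bellman residual of the current value estimate as a new
    feature, then re-estimates the value function by LSTD over the
    adaptively grown feature set.
    """
    n = len(r)
    features = []           # columns of the growing feature matrix
    v = np.zeros(n)         # current value-function estimate
    for _ in range(n_rounds):
        # Bellman residual T(v) - v for the evaluated policy
        residual = r + gamma * P @ v - v
        if np.linalg.norm(residual) < 1e-10:
            break           # converged; avoid a rank-deficient LSTD system
        features.append(residual)
        Phi = np.stack(features, axis=1)          # n x k feature matrix
        # LSTD: solve Phi^T (Phi - gamma P Phi) w = Phi^T r
        A = Phi.T @ (Phi - gamma * P @ Phi)
        b = Phi.T @ r
        w = np.linalg.solve(A, b)
        v = Phi @ w
    return v
```

In this noiseless setting the residuals span a Krylov subspace of `I - gamma * P` with starting vector `r`, so the iteration recovers the exact value function in at most `n` rounds, which is the connection to Krylov methods that the convergence analysis exploits.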
