

Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework

Runzhe Wan · Lin Ge · Rui Song

Auditorium 1 Foyer 83


Online learning in large-scale structured bandits is known to be challenging due to the curse of dimensionality. In this paper, we propose a unified meta-learning framework for a wide class of structured bandit problems in which the parameter space can be factorized at the item level, a class that covers many popular tasks. Compared with existing approaches, the proposed solution is both scalable to large systems and robust, because it relies on a more flexible model. At the core of this framework is a Bayesian hierarchical model that allows information sharing among items via their features, upon which we design a meta Thompson sampling algorithm. Three representative examples are discussed thoroughly. Theoretical analysis and extensive numerical results both support the usefulness of the proposed method.
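To make the idea of sharing information among items via their features concrete, the sketch below shows one Thompson sampling step under a deliberately simplified Gaussian hierarchy. It is an illustration only, not the algorithm from the paper: it assumes each item's mean reward has a Gaussian prior centered at a linear function of its features, that rewards have Gaussian noise with known variance, and that the feature coefficients (here called `gamma`) are already known rather than learned. All function and parameter names are hypothetical.

```python
import numpy as np

def posterior_params(prior_mean, prior_var, reward_sum, n_pulls, noise_var):
    """Conjugate Gaussian update for each item's mean reward.

    Assumed model: theta_i ~ N(prior_mean_i, prior_var),
    reward | theta_i ~ N(theta_i, noise_var).
    """
    post_precision = 1.0 / prior_var + n_pulls / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + reward_sum / noise_var)
    return post_mean, post_var

def thompson_step(rng, features, gamma, prior_var, reward_sum, n_pulls, noise_var):
    """Pick one item by sampling from each item's posterior.

    The prior mean of every item is features @ gamma, so items with
    similar features start from similar beliefs. Treating gamma as
    known is a simplifying assumption of this sketch.
    """
    prior_mean = features @ gamma
    post_mean, post_var = posterior_params(
        prior_mean, prior_var, reward_sum, n_pulls, noise_var
    )
    sampled_theta = rng.normal(post_mean, np.sqrt(post_var))
    return int(np.argmax(sampled_theta))
```

In this simplified setting an item that has never been pulled falls back to its feature-based prior, which is one way a shared model can help with large item sets. A fuller treatment would also place a prior on `gamma` and sample it from its posterior before sampling the item-level parameters.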
