Continuum-Armed Bandits: A Function Space Perspective

Shashank Singh

Keywords: [ Learning Theory and Statistics ] [ Decision Processes and Bandits ]


The continuum-armed bandit problem involves optimizing an unknown objective function given an oracle that evaluates the function at a query point. In the most well-studied case, the objective function is assumed to be Lipschitz continuous, and minimax rates of simple and cumulative regret are known in both the noiseless and noisy settings. In this paper, we investigate continuum-armed bandits under more general smoothness conditions on the objective function, namely Besov smoothness conditions. In both the noiseless and noisy settings, we derive minimax rates for both simple and cumulative regret. In particular, our results show that minimax rates over objective functions in a Besov space are identical to minimax rates over objective functions in the smallest Hölder space into which the Besov space embeds.
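
For readers unfamiliar with the terminology, the quantities named in the abstract can be written out as follows. This is a sketch in standard notation, not the paper's own: it assumes the objective $f$ is maximized over the domain $[0,1]^d$, that $x_1, \dots, x_T$ are the query points, that $\hat{x}_T$ is the point returned after $T$ queries, and that $s$, $p$, $q$ are the usual Besov smoothness and integrability indices. The symbols $S_T$, $R_T$, and $\varepsilon_t$ are introduced here for illustration only.

```latex
% Oracle feedback (epsilon_t = 0 in the noiseless case):
y_t = f(x_t) + \varepsilon_t, \qquad t = 1, \dots, T

% Simple regret: suboptimality of the final recommendation
S_T = \sup_{x \in [0,1]^d} f(x) - f(\hat{x}_T)

% Cumulative regret: total suboptimality of all queries
R_T = \sum_{t=1}^{T} \Bigl( \sup_{x \in [0,1]^d} f(x) - f(x_t) \Bigr)

% Standard Sobolev-type embedding of a Besov space into a Hölder space,
% valid for s > d/p (with s - d/p not an integer):
B^{s}_{p,q}\bigl([0,1]^d\bigr) \hookrightarrow C^{\,s - d/p}\bigl([0,1]^d\bigr)
```

Under these assumptions, the Hölder space referred to in the last sentence of the abstract would have smoothness $s - d/p$. The precise conditions on $s$, $p$, $q$, and $d$ are those stated in the paper itself.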
