Keywords: [ Learning Theory and Statistics ] [ Decision Processes and Bandits ]

Abstract:
We consider the problem of Bayesian Optimization (BO) in which the goal is to design an adaptive querying strategy to optimize a function $f:[0,1]^d\mapsto \reals$. The function is assumed to be drawn from a Gaussian Process,
and can only be accessed through noisy oracle queries.
The most commonly used oracle in BO literature is the noisy Zeroth-Order-Oracle~(ZOO) which returns noise-corrupted function value $y = f(x) + \eta$ at any point $x \in \domain$ queried by the agent. A less studied oracle in BO is the First-Order-Oracle~(FOO) which also returns noisy gradient value at the queried point.
In this paper we consider the fundamental question of quantifying the possible improvement in regret that can be achieved under FOO access as compared to the case in which only ZOO access is available.
Under some regularity assumptions on $K$, we first show that the expected cumulative regret with ZOO of any algorithm must satisfy a lower bound of $\Omega(\sqrt{2^d n})$, where $n$ is the query budget. This lower bound captures the appropriate scaling of the regret on both dimension $d$ and budget $n$, and relies on a novel reduction from BO to a multi-armed bandit~(MAB) problem. We then propose a two-phase algorithm which, with some additional prior knowledge, achieves a vastly improved $\mc{O}\lp d (\log n)^2 \rp$ regret when given access to a FOO.
Together, these two results highlight the significant value of incorporating gradient information in BO algorithms.

Chat is not available.