Bandit-based Maximum Inner Product Search with Data-Dependent Confidence Intervals
Yoichi Sasaki · Yuzuru Okajima
Abstract
Maximum inner product search is a fundamental problem in recommender systems, information retrieval, and machine learning. Recently proposed bandit-based approaches have achieved high scalability with respect to dimensionality and offer favorable precision-speedup trade-offs. However, the lengths of their confidence intervals are determined independently of the actual reward distributions, which can lead to search inefficiency in practice. In this paper, we propose a data-dependent bandit-based algorithm in which the lengths of the confidence intervals are adaptively adjusted based on observed samples. Theoretical analysis demonstrates that our algorithm guarantees $\delta$-correctness for a broad class of distributions, including all log-concave continuous distributions, and that the sample complexity can be reduced adaptively according to individual reward distributions. In experiments, our approach outperformed existing algorithms on both synthetic and real-world datasets.
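To make the idea concrete, the following is a minimal sketch (not the authors' algorithm) of a bandit-based MIPS routine in which each candidate vector is an arm, sampling a random coordinate gives an unbiased estimate of the inner product, and the confidence radius is data-dependent via an empirical-Bernstein bound, so it shrinks with the observed sample variance rather than a distribution-independent worst case. All names (`eb_radius`, `bandit_mips`) and parameter choices here are illustrative assumptions.

```python
import numpy as np


def eb_radius(samples, value_range, delta):
    """Empirical-Bernstein confidence radius (illustrative).

    Data-dependent: the first term scales with the observed sample
    variance, so low-variance reward distributions get tighter
    intervals than a fixed, distribution-independent bound would give.
    """
    n = len(samples)
    var = float(np.var(samples))
    log_term = np.log(3.0 / delta)
    return np.sqrt(2.0 * var * log_term / n) + 3.0 * value_range * log_term / n


def bandit_mips(query, vectors, delta=0.05, batch=64, max_rounds=100, seed=0):
    """Successive elimination for maximum inner product search (sketch).

    Sampling a coordinate j uniformly and returning d * q[j] * x[j] is an
    unbiased estimate of <q, x>; arms whose upper confidence bound falls
    below the best lower confidence bound are eliminated.
    """
    rng = np.random.default_rng(seed)
    query = np.asarray(query, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    n_vec, d = vectors.shape

    # Range of a single-coordinate reward d * q[j] * x[j] (worst case,
    # used only in the lower-order term of the Bernstein bound).
    value_range = 2.0 * d * np.abs(query).max() * np.abs(vectors).max()

    active = list(range(n_vec))
    samples = {i: [] for i in active}
    per_test_delta = delta / (n_vec * max_rounds)  # crude union bound

    for _ in range(max_rounds):
        if len(active) == 1:
            break
        coords = rng.integers(0, d, size=batch)
        for i in active:
            samples[i].extend(d * query[coords] * vectors[i, coords])
        bounds = {}
        for i in active:
            mu = float(np.mean(samples[i]))
            r = eb_radius(samples[i], value_range, per_test_delta)
            bounds[i] = (mu - r, mu + r)
        best_lcb = max(lo for lo, _ in bounds.values())
        active = [i for i in active if bounds[i][1] >= best_lcb]

    # Return the surviving arm with the highest empirical mean.
    return max(active, key=lambda i: float(np.mean(samples[i])))
```

The data-dependent element is confined to `eb_radius`: replacing it with a fixed Hoeffding-style radius would recover the distribution-independent behavior that the abstract identifies as a source of inefficiency.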