Skip to yearly menu bar Skip to main content


Poster

Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits

Efe Mert Karagözlü · Yaşar Cahit Yıldırım · Çağın Ararat · Cem Tekin

Multipurpose Room 1 - Number 27

Abstract: We study pure exploration in bandit problems with vector-valued rewards, where the goal is to (approximately) identify the Pareto set of arms given incomplete preferences induced by a polyhedral convex cone. We address the open problem of designing sample-efficient learning algorithms for such problems. We propose Pareto Vector Bandits (PaVeBa), an adaptive elimination algorithm that nearly matches the gap-dependent and worst-case lower bounds on the sample complexity of $(\epsilon, \delta)$-PAC Pareto set identification. Finally, we provide an in-depth numerical investigation of PaVeBa and its heuristic variants by comparing them with the state-of-the-art multi-objective and vector optimization algorithms on several real-world datasets with conflicting objectives.

Live content is unavailable. Log in and register to view live content