AISTATS Poster Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits

Poster

Learning the Pareto Set Under Incomplete Preferences: Pure Exploration in Vector Bandits

Efe Mert Karagözlü · Yaşar Cahit Yıldırım · Çağın Ararat · Cem Tekin

MR1 & MR2 - Number 27

[ Abstract ]

[ Poster]

Abstract: We study pure exploration in bandit problems with vector-valued rewards, where the goal is to (approximately) identify the Pareto set of arms given incomplete preferences induced by a polyhedral convex cone. We address the open problem of designing sample-efficient learning algorithms for such problems. We propose Pareto Vector Bandits (PaVeBa), an adaptive elimination algorithm that nearly matches the gap-dependent and worst-case lower bounds on the sample complexity of

(ϵ, δ)

$(\epsilon, \delta)$ -PAC Pareto set identification. Finally, we provide an in-depth numerical investigation of PaVeBa and its heuristic variants by comparing them with the state-of-the-art multi-objective and vector optimization algorithms on several real-world datasets with conflicting objectives.

Chat is not available.