

Learning Fair Division from Bandit Feedback

Hakuei Yamada · Junpei Komiyama · Kenshi Abe · Atsushi Iwasaki

MR1 & MR2 - Number 21
Fri 3 May 8 a.m. PDT — 8:30 a.m. PDT


This work addresses learning online fair division under uncertainty, where a central planner sequentially allocates items without precise knowledge of agents’ values or utilities. Departing from conventional online algorithms, the planner here relies on noisy, estimated values obtained after allocating items. We introduce wrapper algorithms utilizing dual averaging, enabling gradual learning of both the type distribution of arriving items and agents’ values through bandit feedback. This approach enables the algorithms to asymptotically achieve optimal Nash social welfare in linear Fisher markets with agents having additive utilities. We also empirically verify the performance of the proposed algorithms across synthetic and empirical datasets.
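To make the setting concrete, the following is a minimal sketch of a dual-averaging allocation loop driven by bandit feedback. It is not the authors' wrapper algorithm. It assumes a pacing-style rule in which each arriving item goes to the agent with the highest multiplier-weighted estimated value, and each multiplier is reset to the agent's budget divided by its running average estimated utility. The function names, the clipping bounds `lo` and `hi`, the optimistic initial estimate of 1.0, and the Gaussian noise model are all hypothetical choices made for illustration.

```python
import math
import random


def nash_social_welfare(utilities, budgets):
    """Budget-weighted geometric mean of (strictly positive) utilities."""
    total = sum(budgets)
    return math.exp(sum(b * math.log(u) for u, b in zip(utilities, budgets)) / total)


def run_dual_averaging(true_values, type_probs, budgets, horizon,
                       noise=0.1, lo=0.05, hi=20.0, seed=0):
    """Illustrative pacing loop; true_values[i][k] is agent i's value for item type k.

    The planner never reads true_values directly when deciding. It only sees a
    noisy observation of the winner's value after each allocation.
    """
    rng = random.Random(seed)
    n_agents, n_types = len(true_values), len(type_probs)
    est_sum = [[0.0] * n_types for _ in range(n_agents)]  # sum of noisy observations
    est_cnt = [[0] * n_types for _ in range(n_agents)]    # number of observations
    beta = [1.0] * n_agents       # pacing multipliers (dual variables)
    cum_est = [0.0] * n_agents    # cumulative estimated utility per agent
    realized = [0.0] * n_agents   # cumulative true utility (for evaluation only)
    wins = [0] * n_agents

    for t in range(1, horizon + 1):
        # An item type arrives from a distribution unknown to the planner.
        k = rng.choices(range(n_types), weights=type_probs)[0]

        # Empirical-mean estimates; unseen pairs get an optimistic default (assumption).
        est = [
            max(est_sum[i][k] / est_cnt[i][k], 1e-6) if est_cnt[i][k] else 1.0
            for i in range(n_agents)
        ]

        # Allocate to the highest paced estimated value.
        winner = max(range(n_agents), key=lambda i: beta[i] * est[i])
        wins[winner] += 1

        # Bandit feedback: a noisy value is observed only for the winner.
        obs = true_values[winner][k] + rng.gauss(0.0, noise)
        est_sum[winner][k] += obs
        est_cnt[winner][k] += 1
        cum_est[winner] += est[winner]
        realized[winner] += true_values[winner][k]

        # Dual-averaging step: multiplier = budget / average utility, clipped.
        for i in range(n_agents):
            avg = cum_est[i] / t
            beta[i] = min(hi, max(lo, budgets[i] / avg)) if avg > 0 else hi

    return realized, beta, wins


if __name__ == "__main__":
    values = [[1.0, 0.2], [0.2, 1.0]]
    utils, beta, wins = run_dual_averaging(values, [0.5, 0.5], [1.0, 1.0], horizon=2000)
    avg_utils = [u / 2000 for u in utils]
    print("wins:", wins, "NSW of average utilities:", nash_social_welfare(avg_utils, [1.0, 1.0]))
```

The sketch has no explicit exploration schedule. It relies only on the optimistic default and on the multipliers of under-served agents growing until they win items, which is a simplification and may not match how the paper handles estimation error.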
