Poster
Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents
Eric Moulines
[
Abstract
]
Abstract:
In this paper, we present the Federated Upper Confidence Bound Value Iteration algorithm (), a novel extension of the algorithm (Azar et al., 2017) tailored for the federated learning framework. We prove that the regret of scales as , with a small additional term due to heterogeneity, where is the number of states, is the number of actions, is the episode length, is the number of agents, and is the number of episodes. Notably, in the single-agent setting, this upper bound matches the minimax lower bound up to polylogarithmic factors, while in the multi-agent scenario, has linear speed-up. To conduct our analysis, we introduce a new measure of heterogeneity, which may hold independent theoretical interest. Furthermore, we show that, unlike existing federated reinforcement learning approaches, 's communication complexity only marginally increases with the number of agents.
Live content is unavailable. Log in and register to view live content