vqSGD: Vector Quantized Stochastic Gradient Descent

Venkata Gandikota · Daniel Kane · Raj Kumar Maity · Arya Mazumdar

Keywords: [ Algorithms, Optimization and Computation Methods ] [ Large Scale, Parallel and Distributed ]

[ Abstract ]
Tue 13 Apr 6:30 p.m. PDT — 8:30 p.m. PDT

Abstract: In this work, we present a family of vector quantization schemes vqSGD (Vector-Quantized Stochastic Gradient Descent) that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization. In the process we derive the following fundamental information theoretic fact: $\Theta(\frac{d}{R^2})$ bits are necessary and sufficient (up to an additive $O(\log d)$ term) to describe an unbiased estimator $\hat{g}(g)$ for any $g$ in the $d$-dimensional unit sphere, under the constraint that $\|\hat{g}(g)\|_2\le R$ almost surely. In particular, we consider a randomized scheme based on the convex hull of a point set, that returns an unbiased estimator of a $d$-dimensional gradient vector with almost surely bounded norm. We provide multiple efficient instances of our scheme, that are near optimal, and require only $o(d)$ bits of communication at the expense of tolerable increase in error. The instances of our quantization scheme are obtained using the properties of binary error-correcting codes and provide a smooth tradeoff between the communication and the estimation error of quantization. Furthermore, we show that vqSGD also offers some automatic privacy guarantees.

Chat is not available.