Large Scale K-Median Clustering for Stable Clustering Instances

Konstantin Voevodski

Keywords: [ Algorithms, Optimization and Computation Methods ] [ Large Scale, Parallel and Distributed ]

Wed 14 Apr 12:45 p.m. PDT — 2:45 p.m. PDT


We study the problem of computing a good k-median clustering in a parallel computing environment. We design an efficient algorithm that gives a constant-factor approximation to the optimal solution for stable clustering instances. The notion of stability that we consider is resilience to perturbations of the distances between the points. Our computational experiments show that our algorithm works well in practice: we are able to find better clusterings than Lloyd’s algorithm and a centralized coreset construction using samples of the same size.
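
For readers unfamiliar with the objective, the following minimal sketch shows what a k-median clustering optimizes: the sum of distances from each point to its nearest center. It is not the algorithm from the paper. The function names (`kmedian_cost`, `brute_force_kmedian`), the toy data, the use of Euclidean distance, and the restriction of centers to input points are all assumptions made for illustration, and the exhaustive search is only feasible on tiny inputs.

```python
import itertools
import math


def kmedian_cost(points, centers):
    """k-median objective: sum of each point's distance to its nearest center."""
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def brute_force_kmedian(points, k):
    """Exhaustively try every k-subset of the points as centers.

    Illustrative baseline only (exponential in k); assumes centers
    are chosen from the input points.
    """
    return min(
        itertools.combinations(points, k),
        key=lambda centers: kmedian_cost(points, centers),
    )


# Hypothetical toy instance: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
best_centers = brute_force_kmedian(points, k=2)
best_cost = kmedian_cost(points, best_centers)
```

On this toy instance the best solution places one center in each pair. An approximation algorithm of the kind described above would aim for a cost within a constant factor of this optimum without exhaustive search.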
