Poster
Large Scale K-Median Clustering for Stable Clustering Instances
Konstantin Voevodski
Keywords: [ Algorithms, Optimization and Computation Methods ] [ Large Scale, Parallel and Distributed ]
We study the problem of computing a good k-median clustering in a parallel computing environment. We design an efficient algorithm that gives a constant-factor approximation to the optimal solution for stable clustering instances. The notion of stability that we consider is resilience to perturbations of the distances between the points. Our computational experiments show that our algorithm works well in practice: we are able to find better clusterings than Lloyd’s algorithm and a centralized coreset construction using samples of the same size.
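
To make the k-median objective and the stability notion concrete, here is a minimal Python sketch. It is not the algorithm from this work. It assumes one common formalization of perturbation resilience, in which the optimal clustering stays the same when each pairwise distance is multiplied by a factor in [1, alpha]; the abstract does not state the exact definition used. The function names, the toy data, the restriction of centers to input points, and the choice alpha = 2 are all illustrative assumptions, and the brute-force search is only feasible for tiny inputs.

```python
import itertools
import random


def kmedian_cost(dist, centers):
    """k-median objective: sum over points of the distance to the nearest center."""
    return sum(min(row[c] for c in centers) for row in dist)


def induced_partition(dist, centers):
    """Assign each point to its nearest center; return the clusters as a set of sets."""
    clusters = {}
    for p, row in enumerate(dist):
        nearest = min(centers, key=lambda c: row[c])
        clusters.setdefault(nearest, set()).add(p)
    return frozenset(frozenset(members) for members in clusters.values())


def optimal_partition(dist, k):
    """Brute-force optimum with centers restricted to input points (assumption)."""
    n = len(dist)
    best = min(itertools.combinations(range(n), k),
               key=lambda cs: kmedian_cost(dist, cs))
    return induced_partition(dist, best)


def perturb(dist, alpha, rng):
    """Scale each pairwise distance by an independent factor in [1, alpha], symmetrically."""
    n = len(dist)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            out[i][j] = out[j][i] = dist[i][j] * rng.uniform(1.0, alpha)
    return out


# Toy instance (hypothetical): two well-separated groups on a line.
points = [0, 1, 2, 100, 101, 102]
dist = [[abs(a - b) for b in points] for a in points]

rng = random.Random(0)
reference = optimal_partition(dist, k=2)

# Empirical check: the optimal partition survives 20 random 2-perturbations.
unchanged = all(optimal_partition(perturb(dist, 2.0, rng), k=2) == reference
                for _ in range(20))
print(sorted(sorted(c) for c in reference), unchanged)
```

On this toy instance the two groups are far enough apart that no perturbation with factors in [1, 2] can change the optimal partition, so the random trials merely illustrate the property; they do not prove it for other instances.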