Want to cluster a point cloud? Relax, no need to round!

Pranjal, Moses, Ravishankar, Soledad, Rachel, and I just uploaded our new paper "Relax, no need to round: integrality of clustering formulations" to the arXiv and I would like to briefly describe some of the results here.

The problem we consider is that of clustering a point cloud in \mathbb{R}^m . One of the most popular methods for this is k-means, or a variant of it called k-medians in which the centers of the clusters must themselves be data points.
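To fix notation: given data points x_1, \dots, x_N \in \mathbb{R}^m and a partition of them into clusters A_1, \dots, A_k, the two objectives can be written as follows (in one common normalization; the paper's conventions may differ slightly by constants):

k-means: \min \sum_{t=1}^{k} \sum_{i \in A_t} \left\| x_i - \mu_t \right\|^2, \quad \text{where } \mu_t = \frac{1}{|A_t|} \sum_{j \in A_t} x_j ,

k-medians: \min \sum_{t=1}^{k} \sum_{i \in A_t} \left\| x_i - c_t \right\|, \quad \text{where each center } c_t \text{ must be one of the data points.}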

Unfortunately, these problems are in general NP-hard, so it is common to rely on heuristics or approximation algorithms. While many of the popular iterative heuristics (such as the classical k-means algorithm, also known as Lloyd's algorithm) are incredibly efficient, they do oftentimes get stuck in local minima and, worst of all, after running the algorithm it is, in general, hard to determine whether the computed solution is a local or a global optimum.
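For concreteness, here is a minimal sketch of Lloyd's algorithm; the initialization and stopping rule below are just illustrative choices, and the sketch ignores edge cases such as empty clusters:

```python
import numpy as np

def lloyd(X, k, iters=100, seed=0):
    """Minimal Lloyd's algorithm: alternate assignments and mean updates.

    A sketch for illustration only; e.g., empty clusters are not handled.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random initialization
    for _ in range(iters):
        # assign each point to its nearest current center
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # recompute each center as the mean of its cluster
        new = np.array([X[labels == t].mean(0) for t in range(k)])
        if np.allclose(new, centers):  # converged to a (possibly local) optimum
            break
        centers = new
    return labels, centers
```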

On the other hand, convex relaxations operate by enlarging the search space (to make the problem tractable), and so the computed solution might not be an admissible solution of the original problem. In this case, it is common to apply a rounding procedure to construct a solution to the original problem that, while possibly suboptimal, has performance guarantees.
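To make this concrete, here is one standard LP relaxation of k-medians (a common textbook form, written here for illustration; the formulation analyzed in the paper may differ in details). With d_{pq} = \|x_p - x_q\|, the variable z_{pq} encodes "point p is assigned to center q" and y_q encodes "point q is opened as a center":

\min \sum_{p,q} d_{pq} z_{pq} \quad \text{s.t.} \quad \sum_{q} z_{pq} = 1 \;\; \forall p, \quad z_{pq} \leq y_q, \quad \sum_{q} y_q = k, \quad z_{pq}, y_q \geq 0.

Forcing z_{pq}, y_q \in \{0,1\} gives back the original combinatorial problem; dropping integrality is what enlarges the search space, and a rounding scheme maps a fractional optimum back to an actual clustering.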

A remarkable thing happens when, for whatever reason, the solution to the enlarged problem is also admissible in the original problem: in that case, not only does the algorithm compute the optimal solution to the original problem, but it also produces a peace-of-mind certificate of its optimality. For this reason, I believe that understanding when this phenomenon happens is an important problem, and I have talked about it here.

In this paper we attempt to understand when this phenomenon happens for convex relaxations of k-means and k-medians. In particular, we are interested in comparing different convex relaxations of these clustering objectives not from the traditional perspective of approximation ratio, but from their ability to exactly recover the planted cluster structure in certain generative models of point clouds.

Inspired by earlier work of Rachel and Abhinav Nellore, we consider a simple setting: the point cloud (intended to have a planted clustering into k clusters) is generated by taking k unit-radius spheres in \mathbb{R}^m and drawing n points independently and uniformly at random on each sphere.
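A sampler for this model is a few lines of numpy (the placement of the centers below is an arbitrary illustrative choice; only the pairwise distances between centers matter):

```python
import numpy as np

def sample_spheres(centers, n, seed=0):
    """Draw n points uniformly from each unit sphere centered at the given rows."""
    rng = np.random.default_rng(seed)
    k, m = centers.shape
    g = rng.standard_normal((k, n, m))
    g /= np.linalg.norm(g, axis=-1, keepdims=True)  # normalized Gaussians are uniform on the sphere
    X = centers[:, None, :] + g                     # shift each sphere to its center
    return X.reshape(k * n, m), np.repeat(np.arange(k), n)

# example: two unit spheres in R^10 with centers at distance Delta = 2.5
Delta = 2.5
centers = np.stack([np.zeros(10), np.r_[Delta, np.zeros(9)]])
X, labels = sample_spheres(centers, n=50)
```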

In order to have any hope of recovering the planted clustering, we need the minimum distance \Delta between the centers of the spheres to be larger than 2 (otherwise the spheres intersect). Rachel and Abhinav Nellore had shown that, for this model, a linear programming relaxation of a certain k-medians objective recovers the spheres, with high probability, as long as this minimum distance satisfies \Delta > 3.75.

In this paper we show that, for arbitrarily large n, a linear programming relaxation of a certain k-medians objective is able to recover the spheres for \Delta \geq 2 + \epsilon, for any \epsilon>0.

We also investigate convex relaxations for a k-means objective. Interestingly, the natural linear programming relaxation works for \Delta \geq 2 + \sqrt{2}, but not below that. On the other hand, a more involved (yet still natural) semidefinite programming relaxation of it recovers all the way down to \Delta \geq 2 + \epsilon.
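The semidefinite relaxation in question is, up to normalization, the well-known Peng–Wei SDP for k-means. Writing D for the matrix of squared pairwise distances, D_{ij} = \|x_i - x_j\|^2, it reads

\min_{X} \; \mathrm{Tr}(DX) \quad \text{s.t.} \quad X \succeq 0, \quad X \geq 0 \text{ entrywise}, \quad X \mathbf{1} = \mathbf{1}, \quad \mathrm{Tr}(X) = k.

A genuine k-clustering corresponds to X being block diagonal, with a block \frac{1}{|A_t|} \mathbf{1}\mathbf{1}^T for each cluster A_t, and the relaxation is tight precisely when the optimal X has this form; roughly speaking, the natural LP is what one gets by dropping the X \succeq 0 constraint.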

Although it is impossible to recover the cluster structure for \Delta < 2, this does not necessarily mean that the relaxation is not tight (tightness meaning that it gives the optimal solution of the k-means or k-medians problem, even if that solution does not match the planted one), and in fact, in some regimes, we observe integrality of the relaxation even below \Delta = 2. However, our techniques seem to break down in this regime (for the same reason that the usual techniques seem to break down when trying to solve this problem). I would love to see a result for such a regime! (Or, similarly, for a generative model with outliers where exact recovery is hopeless, such as a mixture of Gaussians.)
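One can probe this tightness numerically. Here is a sketch using cvxpy (not the code from the paper) that solves the SDP above on data from the sampler and checks whether the optimal X matches the planted partition; small instances keep the SDP fast:

```python
import cvxpy as cp
import numpy as np

def kmeans_sdp(X, k):
    """Solve the Peng-Wei k-means SDP and return the optimal matrix variable."""
    D = ((X[:, None] - X[None]) ** 2).sum(-1)  # squared pairwise distances
    N = len(X)
    Z = cp.Variable((N, N), symmetric=True)
    cons = [Z >> 0, Z >= 0, cp.sum(Z, axis=1) == 1, cp.trace(Z) == k]
    cp.Problem(cp.Minimize(cp.trace(D @ Z)), cons).solve(solver=cp.SCS)
    return Z.value

def planted_matrix(labels):
    """Block matrix with value 1/|A_t| on each planted cluster, 0 elsewhere."""
    B = (labels[:, None] == labels[None]).astype(float)
    return B / B.sum(1, keepdims=True)

Z = kmeans_sdp(X, k=2)  # X, labels from the sampler above
print("tight:", np.abs(Z - planted_matrix(labels)).max() < 1e-3)
```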
