We consider mixtures of k≥2 Gaussian components with
unknown means and unknown covariance (identical for all
components) that are well-separated, i.e., distinct components
have statistical overlap at most k-C for a large enough
constant C≥1.
Previous statistical-query lower bounds
[Ilias Diakonikolas, Daniel M. Kane, and Alistair Stewart,
Statistical query lower bounds for robust estimation of
high-dimensional Gaussians and Gaussian mixtures (extended
abstract),
58th Annual IEEE Symposium on Foundations of
Computer Science—FOCS 2017, pp. 73–84]
give formal evidence that,
even for the special case of colinear means, distinguishing such
mixtures from (pure) Gaussians may be exponentially hard (in k).
We show that, surprisingly, this kind of hardness can only appear
if mixing weights are allowed to be exponentially small. For
polynomially lower bounded mixing weights, we show how to achieve
non-trivial statistical guarantees in quasi-polynomial time.
Concretely, we develop an algorithm based on the sum-of-squares
method with running time quasi-polynomial in the minimum mixing
weight. The algorithm can reliably distinguish between a mixture
of k≥2 well-separated Gaussian components and a (pure) Gaussian
distribution. As a certificate, the algorithm computes a
bipartition of the input sample that separates some pairs of
mixture components, i.e., both sides of the bipartition contain
most of the sample points of at least one component.
For the special case of colinear means, our algorithm outputs a
k-clustering of the input sample that is approximately consistent
with all components of the underlying mixture. We obtain similar
clustering guarantees also for the case that the overlap between
any two mixture components is lower bounded quasi-polynomially in
k (in addition to being upper bounded polynomially in k).
A significant challenge for our results is that they appear to be
inherently sensitive to small fractions of adversarial outliers
unlike most previous algorithmic results for Gaussian mixtures.
The reason is that such outliers can simulate exponentially small
mixing weights even for mixtures with polynomially lower bounded
mixing weights.
A key technical ingredient of our algorithms is a characterization
of separating directions for well-separated Gaussian components in
terms of ratios of polynomials that correspond to moments of two
carefully chosen orders logarithmic in the minimum mixing weight.