A k-modal probability distribution over the domain {1, ..., n} is one whose histogram has at most k "peaks" and "valleys." Such distributions are natural generalizations of monotone (k = 0) and unimodal (k = 1) probability distributions, which have been intensively studied in probability theory and statistics. In this paper we consider the problem of learning an unknown k-modal distribution. The learning algorithm is given access to independent samples drawn from the k-modal distribution p, and must output a hypothesis distribution p̂ such that, with high probability, the total variation distance between p̂ and p is at most ε. We give an efficient algorithm for this problem that runs in time poly(k, log(n), 1/ε). For k ≤ Õ(√(log n)), the number of samples used by our algorithm is very close (within an Õ(log(1/ε)) factor) to being information-theoretically optimal. Prior to this work, computationally efficient algorithms were known only for the cases k = 0, 1 [Bir87b, Bir97]. A novel feature of our approach is that our learning algorithm crucially uses a new property testing algorithm as a key subroutine. The learning algorithm uses the property tester to efficiently decompose the k-modal distribution into k (near)-monotone distributions, which are easier to learn.
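The two notions the abstract relies on, the number of "peaks" and "valleys" in a histogram and the total variation distance between distributions, are easy to make concrete. The sketch below is purely illustrative and is not the paper's algorithm; the function names and the example distributions are our own.

```python
def num_direction_changes(p):
    """Count the direction changes ("peaks" and "valleys") in the histogram p.

    A monotone distribution has 0 changes (k = 0); a unimodal one has
    at most 1 (k = 1). Flat stretches are ignored.
    """
    diffs = [b - a for a, b in zip(p, p[1:])]
    signs = [1 if d > 0 else -1 for d in diffs if d != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# A unimodal (k = 1) distribution over {1, ..., 5} and a nearby hypothesis.
p = [0.1, 0.2, 0.4, 0.2, 0.1]
p_hat = [0.1, 0.25, 0.3, 0.25, 0.1]
```

Here `num_direction_changes(p)` returns 1 (one peak), and `total_variation(p, p_hat)` returns 0.1, so `p_hat` would be an acceptable hypothesis for any ε ≥ 0.1.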