We consider a basic problem in unsupervised learning: learning an unknown Poisson Binomial Distribution. A Poisson Binomial Distribution (PBD) over {0, 1, ..., n} is the distribution of a sum of n independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by S. Poisson in 1837 and are a natural n-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal. We essentially settle the complexity of the learning problem for this basic class of distributions. As our main result we give a highly efficient algorithm which learns to ε-accuracy (in total variation distance) using Õ(1/ε³) samples, independent of n. The running time of the algorithm is quasilinear in the size of its input data, i.e., Õ(log(n)/ε³) bit-operations (observe that each draw from the distribution is a log(n)-bit string). This is nearly optimal, since any algorithm must use Ω(1/ε²) samples. We also give positive and negative results for some extensions of this learning problem.
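To make the definition concrete, the following is a minimal Python sketch (illustrative only, not from the paper) of drawing samples from a PBD: each draw is the sum of n independent Bernoulli trials with possibly unequal success probabilities p_1, ..., p_n. The function name `sample_pbd` and the particular parameter values are our own choices for the example.

```python
import random

def sample_pbd(ps):
    """One draw from the PBD with Bernoulli parameters ps = [p_1, ..., p_n]:
    the sum of n independent coin flips, where flip i is 1 with probability p_i."""
    return sum(random.random() < p for p in ps)

# A PBD over {0, ..., 5} with non-equal expectations
# (a Binomial distribution is the special case of equal p_i).
ps = [0.1, 0.3, 0.5, 0.7, 0.9]

# Empirical distribution from 100,000 draws; per the abstract, the paper's
# learner needs only Õ(1/ε³) such draws, independent of n.
num_samples = 100_000
counts = [0] * (len(ps) + 1)
for _ in range(num_samples):
    counts[sample_pbd(ps)] += 1
for k, c in enumerate(counts):
    print(f"Pr[X = {k}] ≈ {c / num_samples:.3f}")
```

Note that this sketch samples from a known PBD; the learning problem studied in the paper is the reverse direction: given only draws from an unknown PBD, output a hypothesis distribution within ε of it in total variation distance.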