Probably Almost Discriminative Learning
Machine Learning
Matrix computations (3rd ed.)
Property testing and its connection to learning and approximation
Journal of the ACM (JACM)
The space complexity of approximating the frequency moments
Journal of Computer and System Sciences
Synopsis data structures for massive data sets
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Min-wise independent permutations
Journal of Computer and System Sciences - 30th annual ACM symposium on theory of computing
Sampling algorithms: lower bounds and applications
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Robust Characterizations of Polynomials with Applications to Program Testing
SIAM Journal on Computing
An Approximate Lp-Difference Algorithm for Massive Data Streams
STACS '00 Proceedings of the 17th Annual Symposium on Theoretical Aspects of Computer Science
A Complete Promise Problem for Statistical Zero-Knowledge
FOCS '97 Proceedings of the 38th Annual Symposium on Foundations of Computer Science
Efficient Testing of Large Graphs
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
An Approximate L1-Difference Algorithm for Massive Data Streams
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Testing that distributions are close
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Testing Random Variables for Independence and Identity
FOCS '01 Proceedings of the 42nd IEEE symposium on Foundations of Computer Science
Three theorems regarding testing graph properties
Random Structures & Algorithms
Sublinear algorithms for testing monotone and unimodal distributions
STOC '04 Proceedings of the thirty-sixth annual ACM symposium on Theory of computing
The Complexity of Approximating the Entropy
SIAM Journal on Computing
Testing k-wise and almost k-wise independence
Proceedings of the thirty-ninth annual ACM symposium on Theory of computing
Estimating entropy over data streams
ESA'06 Proceedings of the 14th conference on Annual European Symposium - Volume 14
Approximating entropy from sublinear samples
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Testing Expansion in Bounded-Degree Graphs
FOCS '07 Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science
Declaring independence via the sketching of sketches
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Testing symmetric properties of distributions
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Sketching information divergences
Machine Learning
An Expansion Tester for Bounded Degree Graphs
ICALP '08 Proceedings of the 35th international colloquium on Automata, Languages and Programming, Part I
Testing monotone high-dimensional distributions
Random Structures & Algorithms - Proceedings of the Thirteenth International Conference “Random Structures and Algorithms” held May 28–June 1, 2007, Tel Aviv, Israel
Sublinear estimation of entropy and information distances
ACM Transactions on Algorithms (TALG)
Strong Lower Bounds for Approximating Distribution Support Size and the Distinct Elements Problem
SIAM Journal on Computing
A near-optimal algorithm for estimating the entropy of a stream
ACM Transactions on Algorithms (TALG)
Measuring independence of datasets
Proceedings of the forty-second ACM symposium on Theory of computing
Testing monotone continuous distributions on high-dimensional real cubes
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
Testing non-uniform k-wise independent distributions over product spaces
ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming
Estimating entropy and entropy norm on data streams
STACS'06 Proceedings of the 23rd Annual conference on Theoretical Aspects of Computer Science
A Coincidence-Based Test for Uniformity Given Very Sparsely Sampled Discrete Data
IEEE Transactions on Information Theory
Given samples from two distributions over an n-element set, we wish to test whether these distributions are statistically close. We present an algorithm that uses a number of independent samples from each distribution that is sublinear in n, specifically $O(n^{2/3}\epsilon^{-8/3}\log n)$, runs in time linear in the sample size, makes no assumptions about the structure of the distributions, and distinguishes the case where the $\ell_1$ distance between the distributions is small (less than $\max\{\epsilon^{4/3}n^{-1/3}/32,\ \epsilon n^{-1/2}/4\}$) from the case where it is large (more than $\epsilon$). This result can be compared to the lower bound of $\Omega(n^{2/3}\epsilon^{-2/3})$ for this problem given by Valiant [2008]. Our algorithm has applications to the problem of testing whether a given Markov process is rapidly mixing. We present sublinear algorithms for several variants of this problem as well.
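The abstract does not spell out how the tester works, so the following is only an illustrative sketch of one building block often used in sample-based closeness testing: estimating the squared $\ell_2$ distance between two distributions by counting collisions. It is not the full algorithm described above, which must additionally handle the gap between $\ell_2$ and $\ell_1$ distance. The function names are hypothetical and chosen for this example.

```python
from collections import Counter


def self_collision_rate(samples):
    """Fraction of unordered sample pairs that collide.

    This is an unbiased estimate of sum_i p_i^2 for the sampled distribution.
    """
    m = len(samples)
    counts = Counter(samples)
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding_pairs / (m * (m - 1) / 2)


def cross_collision_rate(xs, ys):
    """Fraction of (x, y) pairs with x == y.

    This is an unbiased estimate of sum_i p_i * q_i.
    """
    cx, cy = Counter(xs), Counter(ys)
    hits = sum(c * cy[value] for value, c in cx.items())
    return hits / (len(xs) * len(ys))


def estimate_l2_squared(xs, ys):
    """Estimate ||p - q||_2^2 = ||p||_2^2 + ||q||_2^2 - 2 <p, q>.

    The estimate can be slightly negative because of sampling noise.
    """
    return (
        self_collision_rate(xs)
        + self_collision_rate(ys)
        - 2 * cross_collision_rate(xs, ys)
    )
```

Under these assumptions, a tester would compare `estimate_l2_squared(xs, ys)` against a threshold chosen from n and ε. Counting with `Counter` keeps the running time linear in the number of samples, in the same spirit as the linear-time claim in the abstract.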