Probabilistic counting algorithms for data base applications
Journal of Computer and System Sciences
Pseudorandom generators for space-bounded computations
STOC '90 Proceedings of the twenty-second annual ACM symposium on Theory of computing
The space complexity of approximating the frequency moments
Journal of Computer and System Sciences
Estimating simple functions on the union of data streams
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Finding Frequent Items in Data Streams
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Counting Distinct Elements in a Data Stream
RANDOM '02 Proceedings of the 6th International Workshop on Randomization and Approximation Techniques
Stable distributions, pseudorandom generators, embeddings and data stream computation
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Optimal space lower bounds for all frequency moments
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Tabulation based 4-universal hashing with applications to second moment estimation
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Optimal approximations of the frequency moments of data streams
Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
An improved data stream summary: the count-min sketch and its applications
Journal of Algorithms
Very sparse random projections
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Estimating entropy over data streams
ESA'06 Proceedings of the 14th conference on Annual European Symposium - Volume 14
Monitoring streams: a new class of data management applications
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Practical algorithms for tracking database join sizes
FSTTCS '05 Proceedings of the 25th international conference on Foundations of Software Technology and Theoretical Computer Science
Estimators and tail bounds for dimension reduction in lα (0
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Sketching information divergences
Machine Learning
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Proceedings of the forty-second ACM symposium on Theory of computing
On the exact space complexity of sketching and streaming small norms
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
Estimating hybrid frequency moments of data streams
Journal of Combinatorial Optimization
Tight lower bound for linear sketches of moments
ICALP'13 Proceedings of the 40th international conference on Automata, Languages, and Programming - Volume Part I
Hi-index | 0.00 |
Space-economical estimation of the pth frequency moments, defined as , for p 0, are of interest in estimating all-pairs distances in a large data matrix [14], machine learning, and in data stream computation. Random sketches formed by the inner product of the frequency vector f1, ..., fnwith a suitably chosen random vector were pioneered by Alon, Matias and Szegedy [1], and have since played a central role in estimating Fpand for data stream computations in general. The concept of p-stable sketches formed by the inner product of the frequency vector with a random vector whose components are drawn from a p-stable distribution, was proposed by Indyk for estimating Fp, for 0 pIn this paper, we consider the problem of estimating Fp, for 0 pFp, for 0 psketch [7] and the structure [5]. Our algorithms require space $\tilde{O}(\frac{1}{\epsilon^{2+p}})$ to estimate Fpto within 1 ±茂戮驴factors and requires expected time $O(\log F_1 \log \frac{1}{\delta})$ to process each update. Thus, our technique trades an $O(\frac{1}{\epsilon^p})$ factor in space for much more efficient processing of stream updates. We also present a stand-alone iterative estimator for F1.