Space lower bounds for distance approximation in the data stream model
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
An Information Statistics Approach to Data Stream and Communication Complexity
FOCS '02 Proceedings of the 43rd Symposium on Foundations of Computer Science
Incremental Support Vector Machine Construction
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Comparing Data Streams Using Hamming Norms (How to Zero In)
IEEE Transactions on Knowledge and Data Engineering
An Approximate L1-Difference Algorithm for Massive Data Streams
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Optimal space lower bounds for all frequency moments
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
On demand classification of data streams
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Optimal approximations of the frequency moments of data streams
Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
Profiling internet backbone traffic: behavior models and applications
Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications
Mining anomalies using traffic feature distributions
Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications
Data streaming algorithms for estimating entropy of network traffic
SIGMETRICS '06/Performance '06 Proceedings of the joint international conference on Measurement and modeling of computer systems
Stable distributions, pseudorandom generators, embeddings, and data stream computation
Journal of the ACM (JACM)
Data streams: algorithms and applications
Foundations and Trends® in Theoretical Computer Science
Impact of packet sampling on anomaly detection metrics
Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
A Sketch Algorithm for Estimating Two-Way and Multi-Way Associations
Computational Linguistics
A data streaming algorithm for estimating entropies of OD flows
Proceedings of the 7th ACM SIGCOMM conference on Internet measurement
Nonlinear Estimators and Tail Bounds for Dimension Reduction in l1 Using Cauchy Random Projections
The Journal of Machine Learning Research
Entropy of search logs: how hard is search? with personalization? with backoff?
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Estimators and tail bounds for dimension reduction in lα (0
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
On Estimating Frequency Moments of Data Streams
APPROX '07/RANDOM '07 Proceedings of the 10th International Workshop on Approximation and the 11th International Workshop on Randomization, and Combinatorial Optimization. Algorithms and Techniques
Sketching and Streaming Entropy via Approximation Theory
FOCS '08 Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science
SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
Computationally Efficient Estimators for Dimension Reductions Using Stable Random Projections
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Towards a universal sketch for origin-destination network measurements
NPC'11 Proceedings of the 8th IFIP international conference on Network and parallel computing
Exact sparse recovery with L0 projections
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Compressed Counting (CC) [22] was recently proposed for estimating the αth frequency moments of data streams, where 0 < α ≤ 2. This paper presents a new algorithm for improving CC. The improvement is most substantial when α → 1−. For example, when α = 0.99, the new algorithm reduces the estimation variance roughly 100-fold. This new algorithm would make CC considerably more practical for estimating Shannon entropy. Furthermore, the new algorithm is statistically optimal when α = 0.5.
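
To illustrate why frequency moments with α close to 1 are relevant to Shannon entropy, the following minimal sketch computes the exact αth frequency moment of a small stream and plugs it into the Tsallis entropy, a standard quantity that converges to Shannon entropy as α → 1. This is not the CC estimator or the new algorithm from the paper: it works on exact counts rather than a sketch, and the function names and the toy stream are assumptions made for illustration only.

```python
import math
from collections import Counter


def frequency_moment(counts, alpha):
    """Exact alpha-th frequency moment: sum of count**alpha over items."""
    return sum(c ** alpha for c in counts if c > 0)


def shannon_entropy(counts):
    """Exact Shannon entropy (in nats) of the empirical distribution."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def tsallis_entropy(counts, alpha):
    """Tsallis entropy written in terms of the alpha-th frequency moment.

    T_alpha = (1 - F_alpha / F_1**alpha) / (alpha - 1), which tends to the
    Shannon entropy as alpha -> 1. Requires alpha != 1.
    """
    total = sum(counts)  # F_1, the first frequency moment
    f_alpha = frequency_moment(counts, alpha)
    return (1.0 - f_alpha / total ** alpha) / (alpha - 1.0)


# Hypothetical toy stream; a real stream would be too large to count exactly.
stream = ["a", "b", "a", "c", "a", "b", "d", "a"]
counts = list(Counter(stream).values())

exact = shannon_entropy(counts)
for alpha in (0.9, 0.99, 0.999):
    approx = tsallis_entropy(counts, alpha)
    print(f"alpha={alpha}: Tsallis={approx:.4f}, Shannon={exact:.4f}")
```

Because the formula divides by α − 1, any error in an estimate of the frequency moment is amplified by a factor of 1/|α − 1|, which is one way to see why a lower estimation variance near α = 1 matters for entropy estimation.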