Approximate counting: a detailed analysis
BIT - Ellis Horwood series in artificial intelligence
Counting large numbers of events in small registers
Communications of the ACM
New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice
ACM Transactions on Computer Systems (TOCS)
Proceedings of the 15th Symposium on International Database Engineering & Applications
Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures
When many objects are counted simultaneously in large data streams, as in network traffic monitoring or in Web graph and molecular sequence analyses, memory becomes a limiting factor. Robert Morris [Communications of the ACM, 21:840-842, 1978] proposed a probabilistic technique for approximate counting that is extremely economical. The basic idea is to increment a counter containing the value $X$ with probability $2^{-X}$. As a result, the counter contains an approximation of $\lg n$ after $n$ probabilistic updates, stored in $\lg\lg n$ bits. Here we revisit the original idea of Morris. We introduce a binary floating-point counter that combines a $d$-bit significand with a binary exponent, stored together on $d + \lg\lg n$ bits. The counter yields a simple formula for an unbiased estimate of $n$ with a standard deviation of about $0.6 \cdot n \cdot 2^{-d/2}$. We analyze the floating-point counter's performance in a general framework that applies to any probabilistic counter. In that framework, we provide practical formulas to construct unbiased estimates, and to assess the asymptotic accuracy of any counter.
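The two counters described above can be illustrated with a minimal Python sketch. The abstract does not spell out the update rule of the floating-point counter, so the sketch rests on an assumption: the register $X$ is split into an exponent $t$ (high bits) and a $d$-bit significand $u$ (low bits), and each update succeeds with probability $2^{-t}$. Under that assumption, summing the inverse success probabilities of all earlier states gives the unbiased estimate $(2^d + u) \cdot 2^t - 2^d$. With $d = 0$ this reduces to the Morris counter with estimate $2^X - 1$. The class name `FloatingPointCounter` and its methods are hypothetical, chosen for illustration only.

```python
import random


class FloatingPointCounter:
    """Illustrative probabilistic counter (hypothetical name and layout).

    Assumption: the register x packs an exponent t = x >> d and a d-bit
    significand u = x & (2**d - 1); an update succeeds with probability 2**-t.
    With d = 0 this behaves like a Morris counter.
    """

    def __init__(self, d, rng=None):
        self.d = d
        self.x = 0
        self.rng = rng if rng is not None else random.Random()

    def increment(self):
        t = self.x >> self.d
        # t random bits are all zero with probability 2**-t;
        # for t == 0 the update always succeeds.
        if t == 0 or self.rng.getrandbits(t) == 0:
            self.x += 1

    def estimate(self):
        m = 1 << self.d
        t = self.x >> self.d
        u = self.x & (m - 1)
        # Sum of 1 / (success probability) over all states below x.
        return (m + u) * (1 << t) - m


if __name__ == "__main__":
    counter = FloatingPointCounter(d=4)
    for _ in range(100_000):
        counter.increment()
    print("register value:", counter.x)
    print("estimated count:", counter.estimate())
```

The first $2^d$ updates are counted exactly, since $t = 0$ there; after that, each increase of the exponent halves the update probability. A larger $d$ costs more bits and reduces the spread of the estimate, in line with the $2^{-d/2}$ factor in the standard deviation quoted above.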