Flexible approximate counting

Authors:
Scott A. Mitchell;David M. Day
Affiliations:
Sandia National Laboratories, Albuquerque, NM;Sandia National Laboratories, Albuquerque, NM
Venue:
Proceedings of the 15th Symposium on International Database Engineering & Applications
Year:
2011

Citing 14

Approximate counting: a detailed analysis. BIT - Ellis Horwood series in artificial intelligence
Analysis of a splitting process arising in probabilistic counting and other related algorithms. Random Structures & Algorithms
The art of computer programming, volume 2 (3rd ed.): seminumerical algorithms
Counting large numbers of events in small registers. Communications of the ACM
Space/time trade-offs in hash coding with allowable errors. Communications of the ACM
Spectral Bloom filters. Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Primitive Polynomials Over GF(2) of Degree up to 660 with Uniformly Distributed Coefficients. Journal of Electronic Testing: Theory and Applications
Approximate counts and quantiles over sliding windows. PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Approximate frequency counts over data streams. VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Generalized approximate counting revisited. Theoretical Computer Science
Succinct approximate counting of skewed data. IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Probabilistic counting with randomized storage. IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Approximate counting with a floating-point counter. COCOON'10 Proceedings of the 16th annual international conference on Computing and combinatorics
Counting by coin tossings. ASIAN'04 Proceedings of the 9th Asian Computing Science conference on Advances in Computer Science: dedicated to Jean-Louis Lassez on the Occasion of His 5th Cycle Birthday

Abstract

Approximate counting [18] is useful for data stream and database summarization. It suits settings that allow only one pass over the data, require low memory usage, and can tolerate some relative error. Approximate counters use fewer bits than exact counters; we focus on 8 bits, but our results are general. These small counters represent a sparse sequence of larger numbers, and a counter is incremented probabilistically based on the spacing between the numbers it represents. Our contributions are a customized distribution of counter values and efficient strategies for deciding when to increment them. At run time, users may independently select the spacing (accuracy) of the approximate counter for small, medium, and large values. The user also selects the maximum number to count up to, and our algorithm then selects the exponential base of the spacing. These options provide additional flexibility over both classic approximate counting and Csűrös's [4] floating-point approximate counting, and they provide additional structure, a useful schema for users, over Kruskal and Greenberg [13]. We describe two new and efficient strategies for incrementing approximate counters: using a deterministic countdown or sampling from a geometric distribution. In Csűrös's method all increments are powers of two, so random bits rather than full random numbers can be used. We also provide the option to use powers of two while retaining flexibility. We show when each strategy is fastest in our implementation.
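
As an illustration of the ideas the abstract names, the following is a minimal sketch of a classic single-base approximate counter, not the authors' implementation. It assumes that counter value c represents the number (b^c - 1)/(b - 1), so the spacing to the next represented number is b^c and the counter advances with probability b^-c. The names `represented`, `choose_base`, `events_until_advance`, and `ApproxCounter` are hypothetical, and the bisection used to pick the base from the user's maximum count is an assumption about how such a selection could be done. The paper's separate accuracy settings for small, medium, and large values, its deterministic countdown, and its powers-of-two option are not reproduced here.

```python
import math
import random


def represented(c, base):
    """Number represented by counter value c: 0, 1, 1 + b, 1 + b + b^2, ..."""
    if base == 1.0:
        return float(c)
    return (base ** c - 1.0) / (base - 1.0)


def choose_base(max_count, bits=8):
    """Bisect for the base in (1, 2) whose largest counter value represents max_count.

    Assumption: bits <= 10, so that 2.0 ** (2**bits - 1) does not overflow a float.
    """
    top = 2 ** bits - 1
    if not top < max_count < represented(top, 2.0):
        raise ValueError("max_count must exceed 2**bits - 1 and fit within base 2")
    lo, hi = 1.0, 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if represented(top, mid) < max_count:
            lo = mid
        else:
            hi = mid
    return hi


def events_until_advance(value, base, rng):
    """Geometric sample: how many events pass until the counter next advances."""
    p = base ** -value              # advance probability = 1 / spacing
    if p >= 1.0:
        return 1
    u = 1.0 - rng.random()          # uniform on (0, 1]
    return 1 + int(math.log(u) / math.log1p(-p))


class ApproxCounter:
    def __init__(self, max_count, bits=8, rng=None):
        self.top = 2 ** bits - 1
        self.base = choose_base(max_count, bits)
        self.value = 0              # the small stored counter
        self.rng = rng or random.Random()

    def estimate(self):
        return represented(self.value, self.base)

    def increment(self):
        """Per-event strategy: one random number for every event."""
        if self.value < self.top and self.rng.random() < self.base ** -self.value:
            self.value += 1

    def add_many(self, n):
        """Geometric strategy: one random number per advance, for n events."""
        remaining = n
        while remaining > 0 and self.value < self.top:
            skip = events_until_advance(self.value, self.base, self.rng)
            if skip > remaining:
                break               # memoryless, so the partial wait can be dropped
            remaining -= skip
            self.value += 1


if __name__ == "__main__":
    counter = ApproxCounter(max_count=10**6, rng=random.Random(1))
    counter.add_many(50_000)
    print(counter.value, round(counter.estimate()))
```

Under these assumptions each event raises the expected estimate by exactly one, so the estimate is unbiased until the counter saturates. The geometric variant draws far fewer random numbers once the spacing is large, which is the kind of trade-off between incrementing strategies that the abstract refers to.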