Improved range-summable random variable construction algorithms

Authors:
A. R. Calderbank;A. Gilbert;K. Levchenko;S. Muthukrishnan;M. Strauss
Affiliations:
Princeton University, Princeton, New Jersey;University of Michigan, Ann Arbor, Michigan;University of California San Diego, La Jolla, California;Rutgers University, Piscataway, New Jersey;University of Michigan, Ann Arbor, Michigan
Venue:
SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
Year:
2005

Abstract

Range-summable universal hash functions, also known as range-summable random variables, are binary-valued hash functions which can efficiently hash single values as well as ranges of values from the domain. They have found several applications in the area of data stream processing, where they are used to construct sketches: small-space summaries of the input sequence.

We present two new constructions of range-summable universal hash functions on n-bit strings: one based on Reed-Muller codes, which gives k-universal hashing using O(n log k) space and time for point operations and O(n^2 log k) for range operations, and another based on a new subcode of the second-order Reed-Muller code, which gives 5-universal hashing using O(n) space, O(n log^3 n) time for point operations, and O(n^3) time for range operations.

We also present a new sketch data structure using the new hash functions which improves several previous results.
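
To make the notion of "range-summable" concrete, the following is a minimal Python sketch of a much simpler hash family than the two constructions announced above. It is not the paper's algorithm. It assumes a random affine function over GF(2) on n-bit strings (a codeword of the first-order Reed-Muller code), read as a value in {+1, -1}; this family is only 3-wise independent. The class name LinearSignHash and its methods point and range_sum are hypothetical names chosen for this illustration. The sum over a range is computed without enumerating the range: the range is split into O(n) aligned power-of-two blocks, and each block sums either to zero or to the block size times a single point value.

```python
import random


def parity(v: int) -> int:
    """Parity (0 or 1) of the number of set bits in v."""
    return bin(v).count("1") & 1


class LinearSignHash:
    """Illustrative +1/-1 hash h(x) = (-1)^(<a, x> xor b) on n-bit inputs.

    Assumption for this sketch: a is a random n-bit vector and b a random
    bit, so h is a random affine function over GF(2).
    """

    def __init__(self, n_bits: int, seed=None):
        rng = random.Random(seed)
        self.n = n_bits
        self.a = rng.getrandbits(n_bits)
        self.b = rng.getrandbits(1)

    def point(self, x: int) -> int:
        """Hash a single value to +1 or -1."""
        return -1 if parity(self.a & x) ^ self.b else 1

    def _dyadic_sum(self, start: int, j: int) -> int:
        """Sum of h over [start, start + 2**j), with start a multiple of 2**j."""
        if self.a & ((1 << j) - 1):
            # Some low bit of a is set: the +1 and -1 values cancel exactly.
            return 0
        # h ignores the low j bits, so it is constant on the block.
        return (1 << j) * self.point(start)

    def range_sum(self, lo: int, hi: int) -> int:
        """Sum of h(x) for lo <= x < hi, assuming 0 <= lo <= hi <= 2**n."""
        total = 0
        while lo < hi:
            # Largest aligned block starting at lo that still fits in [lo, hi).
            j = (lo & -lo).bit_length() - 1 if lo else self.n
            while (1 << j) > hi - lo:
                j -= 1
            total += self._dyadic_sum(lo, j)
            lo += 1 << j
        return total
```

A range sum such as LinearSignHash(32, seed=7).range_sum(1000, 2000000000) then takes a number of steps polynomial in n = 32 rather than proportional to the length of the range, which is the property the abstract's range-operation bounds quantify for the stronger k-universal and 5-universal constructions.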