A standard technique from the hashing literature is to use two hash functions h_1(x) and h_2(x) to simulate additional hash functions of the form g_i(x) = h_1(x) + i·h_2(x). We demonstrate that this technique can be usefully applied to Bloom filters and related data structures. Specifically, only two hash functions are necessary to effectively implement a Bloom filter without any loss in the asymptotic false positive probability. This leads to less computation and potentially less need for randomness in practice.
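As an illustration of the technique described above, the following is a minimal sketch of a Bloom filter that derives all k probe positions from two base hash values. Several details are assumptions made for the example and are not taken from the text: the class name `DoubleHashBloomFilter`, the use of SHA-256 split into two 64-bit halves as the two base hashes, the reduction modulo the table size m, and the one-byte-per-bit storage chosen for readability.

```python
import hashlib


class DoubleHashBloomFilter:
    """Bloom filter sketch: k probe positions from two base hashes.

    Hypothetical illustration; names and hash choices are assumptions.
    """

    def __init__(self, m: int, k: int) -> None:
        self.m = m                  # number of bit positions
        self.k = k                  # number of simulated hash functions
        self.bits = bytearray(m)    # one byte per position, for clarity

    def _base_hashes(self, item: bytes) -> tuple[int, int]:
        # Assumption: take two 64-bit halves of one SHA-256 digest
        # as the two base hash values h1(x) and h2(x).
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return h1, h2

    def indexes(self, item: bytes) -> list[int]:
        # g_i(x) = h1(x) + i * h2(x), reduced modulo m, for i = 0..k-1.
        h1, h2 = self._base_hashes(item)
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: bytes) -> None:
        for idx in self.indexes(item):
            self.bits[idx] = 1

    def __contains__(self, item: bytes) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[idx] for idx in self.indexes(item))


bf = DoubleHashBloomFilter(m=1024, k=5)
for word in (b"alpha", b"beta", b"gamma"):
    bf.add(word)
```

Each insertion or query computes one digest and then only additions and multiplications, rather than k independent hashes. In this sketch, if h2 happens to be a multiple of m, all k positions coincide; a practical implementation would likely need to handle that case, which is left out here.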