Many large storage systems use approximate-membership-query (AMQ) data structures to deal with the massive amounts of data that they process. An AMQ data structure is a dictionary that trades off space for a false positive rate on membership queries. It is designed to fit into small, fast storage, and it is used to avoid I/Os on slow storage. The Bloom filter is a well-known example of an AMQ data structure. Bloom filters, however, do not scale outside of main memory. This paper describes the Cascade Filter, an AMQ data structure that scales beyond main memory, supporting over half a million insertions/deletions per second and over 500 lookups per second on a commodity flash-based SSD.
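To make the AMQ trade-off concrete, the following is a minimal in-memory Bloom filter sketch in Python. It is not the Cascade Filter and is not taken from the paper. The class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the choice of SHA-256 with double hashing are all assumptions made for illustration. The sketch shows the defining behavior: a lookup never reports a false negative, but it may report a false positive, and the false positive rate drops as more bits are spent per key.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter (hypothetical names, not the paper's code)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits        # size of the bit array (space budget)
        self.num_hashes = num_hashes    # bit positions set per key
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Assumption: derive the probe positions from one SHA-256 digest
        # using double hashing (h1 + i * h2), a common simplification.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd stride
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


if __name__ == "__main__":
    bf = BloomFilter(num_bits=8192, num_hashes=4)
    for i in range(500):
        bf.add(f"key-{i}")
    # Inserted keys are always reported as possibly present.
    print(all(bf.might_contain(f"key-{i}") for i in range(500)))
    # Keys never inserted are occasionally reported as present.
    print(sum(bf.might_contain(f"absent-{i}") for i in range(1000)))
```

Because each insertion and lookup touches `num_hashes` effectively random positions in the bit array, a filter of this kind performs poorly once the array no longer fits in main memory. That is the scaling limitation the abstract attributes to Bloom filters.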