Built-in test for VLSI: pseudorandom techniques
Built-in test for VLSI: pseudorandom techniques
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Bloom filtering cache misses for accurate data speculation and prefetching
ICS '02 Proceedings of the 16th international conference on Supercomputing
Long and Fast Up/Down Counters
IEEE Transactions on Computers
Synchronous Up/Down Counter with Clock Period Independent of Counter Size
ARITH '97 Proceedings of the 13th Symposium on Computer Arithmetic (ARITH '97)
MTDT '99 Proceedings of the 1999 IEEE International Workshop on Memory Technology, Design, and Testing
Leakage power modeling and optimization in interconnection networks
Proceedings of the 2003 international symposium on Low power electronics and design
JETTY: Filtering Snoops for Reduced Energy Consumption in SMP Servers
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Analytical Models for Leakage Power Estimation of Memory Array Structures
CODES+ISSS '04 Proceedings of the international conference on Hardware/Software Codesign and System Synthesis: 2004
RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence
Proceedings of the 32nd annual international symposium on Computer Architecture
L-CBF: a low-power, fast counting bloom filter architecture
Proceedings of the 2006 international symposium on Low power electronics and design
Analysis and Design of Digital Integrated Circuits
Analysis and Design of Digital Integrated Circuits
Cardinality estimation and dynamic length adaptation for Bloom filters
Distributed and Parallel Databases
Hi-index | 0.00 |
An increasing number of architectural techniques have relied on hardware counting bloom filters (CBFs) to improve upon the energy, delay, and complexity of various processor structures. CBFs improve the energy and speed of membership tests by maintaining an imprecise and compact representation of a large set to be searched. This paper studies the energy, delay, and area characteristics of two implementations for CBFs using full custom layouts in a commercial 0.13-µm fabrication technology. One implementation, S-CBF, uses an SRAM array of counts and a shared up/down counter. Our proposed implementation, L-CBF, utilizes an array of up/down linear feedback shift registers and local zero detectors. Circuit simulations show that for a 1 K-entry CBF with a 15-bit count per entry, L-CBF compared to S-CBF is 3.7 × or 1.6 × faster and requires 2.3 × or 1.4 × less energy depending on the operation. Additionally, this paper presents analytical energy and delay models for L-CBF. These models can estimate energy and delay of various CBF organizations during architectural level explorations when a physical level implementation is not available. Our results demonstrate that for a variety of L-CBF organizations, the estimations by analytical models are within 5% and 10% of Spectre simulation results for delay and energy, respectively.