Area-efficient near-associative memories on FPGAs

Authors:
Udit Dhawan;André DeHon
Affiliations:
University of Pennsylvania, Philadelphia, PA, USA;University of Pennsylvania, Philadelphia, PA, USA
Venue:
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Year:
2013

Abstract

Associative memories can map sparsely used keys to values with low latency, but they can incur heavy area overheads. Today's mainstream FPGAs lack customized hardware for associative memories, which exacerbates the overhead of building these memories from BRAMs that support only fixed-address matching. In this paper, we develop a new, FPGA-friendly memory architecture based on a multiple-hash scheme that achieves near-associative performance (less than 5% of evictions due to conflicts) without the area overheads of a fully associative memory on FPGAs. Using the proposed architecture as a 64KB L1 data cache, we show that it achieves near-associative miss rates for a set of benchmark programs from the SPEC2006 suite while consuming 6-7× less FPGA memory resources than fully associative memories generated by the Xilinx Coregen tool. Benefits increase with match width, allowing area reductions of up to 100×. At the same time, the new architecture has lower latency than the fully associative memory: for a 64-bit key, 3.7 ns for a 1024-entry flat version or 6.1 ns for an area-efficient version, compared to 8.8 ns for a fully associative memory.
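
The abstract does not spell out the architecture, so the following is only a minimal software sketch of the general multiple-hash idea, not the authors' design. It assumes each key is hashed once per bank, each bank is a plain address-indexed array (standing in for a BRAM), and a conflict occurs only when every candidate slot is already occupied. The class name, the salted BLAKE2 hash, and the "evict from bank 0" policy are all hypothetical choices made for illustration.

```python
import hashlib


class MultiHashMemory:
    """Toy multiple-hash key/value store (illustrative assumption, not the paper's design)."""

    def __init__(self, num_banks=4, bank_size=256):
        self.num_banks = num_banks
        self.bank_size = bank_size
        # Each bank is an address-indexed array of (key, value) slots, like one BRAM.
        self.banks = [[None] * bank_size for _ in range(num_banks)]
        self.conflict_evictions = 0

    def _index(self, bank, key):
        # Hypothetical per-bank hash: salt the key with the bank number.
        digest = hashlib.blake2b(f"{bank}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.bank_size

    def lookup(self, key):
        # Hardware would probe all banks in parallel; here we probe them in turn.
        for bank in range(self.num_banks):
            slot = self.banks[bank][self._index(bank, key)]
            if slot is not None and slot[0] == key:
                return slot[1]
        return None

    def insert(self, key, value):
        candidates = [(b, self._index(b, key)) for b in range(self.num_banks)]
        # Overwrite in place if the key is already stored.
        for b, i in candidates:
            slot = self.banks[b][i]
            if slot is not None and slot[0] == key:
                self.banks[b][i] = (key, value)
                return
        # Otherwise take the first empty candidate slot.
        for b, i in candidates:
            if self.banks[b][i] is None:
                self.banks[b][i] = (key, value)
                return
        # Every candidate is occupied: evict the first one (arbitrary policy).
        b, i = candidates[0]
        self.banks[b][i] = (key, value)
        self.conflict_evictions += 1
```

In this sketch a key is displaced only when all of its candidate slots are taken, which is why adding hash functions moves behavior toward that of a fully associative memory while each bank still needs only ordinary address decoding. The `conflict_evictions` counter is a rough software analogue of the conflict-eviction measure the abstract reports.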