Constructing application-specific memory hierarchies on FPGAs

Authors:
Harald Devos; Jan Van Campenhout; Ingrid Verbauwhede; Dirk Stroobandt
Affiliations:
Parallel Information Systems, ELIS-Dept., Ghent University, Gent, Belgium; Parallel Information Systems, ELIS-Dept., Ghent University, Gent, Belgium; Katholieke Universiteit Leuven, ESAT, Leuven-Heverlee, Belgium; Parallel Information Systems, ELIS-Dept., Ghent University, Gent, Belgium
Venue:
Transactions on High-Performance Embedded Architectures and Compilers III
Year:
2011

Abstract

The high performance potential of an FPGA is not fully exploited if a design suffers from a memory bottleneck. A memory hierarchy is therefore needed to reuse data in on-chip buffer memories and to minimize the number of accesses to off-chip memory. Buffer memories not only hide the external memory latency, but can also be used to remap data and to increase the on-chip bandwidth through parallel access to multiple buffers. This paper discusses the differences and similarities between memory hierarchies on processor-based and FPGA-based systems and presents a step-by-step methodology for constructing a memory hierarchy on an FPGA.
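
The data-reuse idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's methodology: the 3-tap filter, the function names `fir_no_buffer` and `fir_with_buffer`, and the list standing in for off-chip memory are all hypothetical. It compares how many external reads a sliding-window computation needs with and without a small on-chip buffer.

```python
def fir_no_buffer(external, taps):
    """Model without a buffer: every tap is read directly from external memory."""
    reads = 0
    out = []
    for i in range(len(external) - len(taps) + 1):
        acc = 0
        for k, coeff in enumerate(taps):
            acc += coeff * external[i + k]  # counts as one off-chip access
            reads += 1
        out.append(acc)
    return out, reads


def fir_with_buffer(external, taps):
    """Model with a buffer: each element is fetched once and reused on chip."""
    n = len(taps)
    window = []  # stands in for an on-chip shift register holding n elements
    reads = 0
    out = []
    for value in external:  # single streaming pass over external memory
        reads += 1
        window.append(value)
        if len(window) > n:
            window.pop(0)  # drop the oldest element once the window is full
        if len(window) == n:
            out.append(sum(c * w for c, w in zip(taps, window)))
    return out, reads


if __name__ == "__main__":
    data = list(range(10))
    taps = [1, 2, 1]
    direct_out, direct_reads = fir_no_buffer(data, taps)
    buffered_out, buffered_reads = fir_with_buffer(data, taps)
    print(direct_out == buffered_out, direct_reads, buffered_reads)
```

In this model both versions produce the same results, but the buffered version reads each input element from external memory once, while the unbuffered version reads it once per tap. On an FPGA the window would typically map to registers or block RAM; that mapping is outside the scope of this sketch.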