Memory-intensive behaviors often contain large arrays that are synthesized into off-chip memories. With the increasing gap between on-chip and off-chip memory access delays, it is imperative to exploit the efficient access modes of modern memories (e.g., page-mode DRAMs) in order to alleviate the memory bandwidth bottleneck. Although recent research in high-level synthesis (HLS) has addressed memory-based synthesis, current techniques cannot efficiently exploit the special access modes of these off-chip memories, resulting in significantly inferior performance with these memory library parts. Our work addresses this issue by (a) modeling realistic off-chip memory access modes for HLS, (b) presenting algorithms to infer the applicability of HLS with these memory access modes, and (c) transforming the input behavior to provide further memory access optimizations during HLS. We demonstrate the utility of our approach using a suite of memory-intensive benchmarks with a realistic DRAM library module. Experimental results show a significant performance improvement (more than 40%) as a result of our optimization techniques.
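As a rough illustration of why page-mode access matters, the following Python sketch counts cycles for a sequence of word addresses under a deliberately simplified page-mode DRAM model. It is not the memory model or algorithm of this work. The function name `access_cycles`, the page size (256 words), and the cycle costs (5 for a new row, 2 for a hit in the open page) are all assumptions chosen for illustration.

```python
def access_cycles(addresses, page_words=256, miss_cycles=5, hit_cycles=2):
    """Total cycles for a word-address sequence under a toy page-mode model.

    All parameters are hypothetical: an access to the currently open page
    costs hit_cycles; any other access costs miss_cycles and opens that page.
    """
    open_page = None
    total = 0
    for addr in addresses:
        page = addr // page_words
        if page == open_page:
            total += hit_cycles      # page-mode access within the open row
        else:
            total += miss_cycles     # full row access, then keep the row open
            open_page = page
    return total


# A 64x64 array stored row by row, traversed in two different loop orders.
ROWS, COLS = 64, 64
row_major = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_major = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

row_cost = access_cycles(row_major)
col_cost = access_cycles(col_major)
```

Under these assumed costs, the row-order traversal stays inside each open page far longer than the column-order traversal and therefore needs fewer cycles, which is the kind of difference a behavioral transformation such as loop reordering could expose to a synthesis tool.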