Software pipelining: an effective scheduling technique for VLIW machines
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Logic synthesis
Data and memory optimization techniques for embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
An efficient profile-analysis framework for data-layout optimizations
POPL '02 Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Some simplified NP-complete problems
STOC '74 Proceedings of the sixth annual ACM symposium on Theory of computing
Memory allocation and mapping in high-level synthesis: an integrated approach
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Techniques for synthesizing binaries to an advanced register/memory structure
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
Automatic pool allocation: improving performance by controlling data structure layout in the heap
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Whole execution traces and their applications
ACM Transactions on Architecture and Code Optimization (TACO)
Memory access pattern analysis and stream cache design for multimedia applications
ASP-DAC '03 Proceedings of the 2003 Asia and South Pacific Design Automation Conference
METRIC: Memory tracing via dynamic binary rewriting to identify cache inefficiencies
ACM Transactions on Programming Languages and Systems (TOPLAS)
Valgrind: a framework for heavyweight dynamic binary instrumentation
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Forma: A framework for safe automatic array reshaping
ACM Transactions on Programming Languages and Systems (TOPLAS)
Automatic On-chip Memory Minimization for Data Reuse
FCCM '07 Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
MPADS: memory-pooling-assisted data splitting
Proceedings of the 7th international symposium on Memory management
Compilation Techniques for Reconfigurable Architectures
Compilation Techniques for Reconfigurable Architectures
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Using memory profile analysis for automatic synthesis of pointers code
ACM Transactions on Embedded Computing Systems (TECS)
Quipu: A Statistical Model for Predicting Hardware Resources
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Memory partitioning for multidimensional arrays in high-level synthesis
Proceedings of the 50th Annual Design Automation Conference
The benefits of using variable-length pipelined operations in high-level synthesis
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
In high-level synthesis, pipelined designs are often restricted by the number of memory banks available to the synthesis system. Using multiple memory banks can improve the performance of accelerated applications. Currently, programmers must manually assign data structures to specific memory banks on the accelerator. This paper describes Automatic Memory Partitioning, a method for automatically partitioning data structures into multiple memory banks for increased parallelism and performance. We use source code instrumentation to collect memory traces in order to detect linear memory access patterns. The memory traces are used to split data structures into disjoint memory regions and determine which segments may benefit from parallel memory access. We present an ILP based algorithm for allocating memory segments into multiple memory banks. Experiments show significant improvements in performance while using a minimal number of memory banks.