Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
SH-5: The 64-Bit SuperH Architecture
IEEE Micro
Brook for GPUs: stream computing on graphics hardware
ACM SIGGRAPH 2004 Papers
Dynamic overlay of scratchpad memory for energy minimization
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Data partitioning for maximal scratchpad usage
ASP-DAC '03 Proceedings of the 2003 Asia and South Pacific Design Automation Conference
A dynamic code placement technique for scratchpad memory using postpass optimization
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
A scalable parallel H.264 decoder on the cell broadband engine architecture
CODES+ISSS '09 Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis
SDRM: simultaneous determination of regions and function-to-region mapping for scratchpad memories
HiPC'08 Proceedings of the 15th international conference on High performance computing
Executing synchronous dataflow graphs on a SPM-based multicore architecture
Proceedings of the 49th Annual Design Automation Conference
A software-only scheme for managing heap data on limited local memory(LLM) multicore processors
ACM Transactions on Embedded Computing Systems (TECS)
Scheduling of synchronous data flow models onto scratchpad memory-based embedded processors
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on ESTIMedia'10
Dynamic Power and Thermal Management of NoC-Based Heterogeneous MPSoCs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Hi-index | 0.00 |
Many embedded processors incorporate scratchpad memories (SPM) due to their lower power consumption characteristics. SPMs are utilized to host both code and data, often on the same physical unit. Synchronous dataflow (SDF) is a popular format for specifying many embedded system applications particularly in multimedia and network processing domains. Execution of SDF specifications on SPM based processors involves division of memory between actor code and buffers, and scheduling of actor executions and code overlays such that latency is minimized subject to the memory constraints. In our problem instance a traditional minimum buffer SDF schedule could require a larger code overlay overhead and therefore may not be optimal. The paper presents a three stage integer linear programming (ILP) formulation to solve the problem. Further, the paper also introduces modifications to the three stage ILP for incorporating code prefetching optimization to further reduce the code overlay overhead. The effectiveness of the proposed approaches is evaluated by comparisons with minimum buffer SDF schedules for several benchmark applications.