Most vendors of digital signal processors (DSPs) support a Harvard architecture, which has two or more memory buses, one for program and one or more for data, and allows the processor to access multiple words of data from memory in a single instruction cycle. Many existing fixed-point DSPs are also known to have an irregular architecture with heterogeneous registers, that is, multiple register files that are distributed and dedicated to different sets of instructions. Although several studies have addressed the efficient assignment of data to multimemory banks, most of them assumed processors with relatively simple, homogeneous general-purpose registers. Thus, several vendor-provided compilers for DSPs that we examined were unable to efficiently assign data to multiple data memory banks, and so often failed to generate highly optimized code for their machines. Consequently, programmers for these DSPs often manually assign program variables to memories so as to fully utilize multimemory banks in their code. This paper reports on our recent attempt to address this problem by presenting an algorithm that helps the compiler to efficiently assign data to multimemory banks. Our algorithm differs from previous work in that it assigns variables to memory banks in separate, decoupled code generation phases, instead of a single, tightly coupled phase. The experimental results show that our decoupled algorithm greatly simplifies our code generation process; thus our compiler runs extremely fast, yet generates target code that is comparable in quality to the code generated by a coupled approach.