PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Bus-invert coding for low-power I/O
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Achieving Full Parallelism Using Multidimensional Retiming
IEEE Transactions on Parallel and Distributed Systems
Automatically partitioning threads for multithreaded architectures
Journal of Parallel and Distributed Computing - Special issue on compilation and architectural support for parallel applications
Influence of compiler optimizations on system power
Proceedings of the 37th Annual Design Automation Conference
Low-energy intra-task voltage scheduling using static timing analysis
Proceedings of the 38th annual Design Automation Conference
Optimizing compilers for modern architectures: a dependence-based approach
Optimizing compilers for modern architectures: a dependence-based approach
Energy-conscious compilation based on voltage scaling
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Task scheduling and voltage selection for energy minimization
Proceedings of the 39th annual Design Automation Conference
Power and performance evaluation of globally asynchronous locally synchronous processors
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Managing power and performance for System-on-Chip designs using Voltage Islands
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Dynamic frequency and voltage control for a multiple clock domain microarchitecture
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Architecting voltage islands in core-based system-on-a-chip designs
Proceedings of the 2004 international symposium on Low power electronics and design
Architecting voltage islands in core-based system-on-a-chip designs
Proceedings of the 2004 international symposium on Low power electronics and design
General loop fusion technique for nested loops considering timing and code size
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Journal of VLSI Signal Processing Systems
Energy minimization with loop fusion and multi-functional-unit scheduling for multidimensional DSP
Journal of Parallel and Distributed Computing
Optimal loop parallelization for maximizing iteration-level parallelism
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
Dynamic and leakage energy minimization with soft real-time loop scheduling and voltage assignment
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Loop distribution and fusion with timing and code size optimization for embedded DSPs
EUC'05 Proceedings of the 2005 international conference on Embedded and Ubiquitous Computing
Rotation scheduling: a loop pipelining algorithm
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Automatic extraction of multi-objective aware pipeline parallelism using genetic algorithms
Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Multi-objective aware extraction of task-level parallelism using genetic algorithms
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
With the advance of semiconductor, multi-core architecture is inevitable in today's embedded system design. Nested loops are usually the most critical part in multimedia and high performance DSP (Digital Signal Processing) systems. Hence, maximizing loop parallelism is an important issue to improve the performance of a modern compiler. This paper studies how to maximize the system performance with the consideration of energy reduction for applications with multidimensional nested loops on multi-core DSP architectures. An algorithm, EALPM (Energy-Aware Loop Parallelism Maximization), is proposed in this paper. We implemented a two phase strategy. First, the strategy uses retiming and loop transformation to parallelize nested loops. Then the strategy employs a novel voltage assignment algorithm to reduce total energy consumption. The experimental results show that on average, both performance and energy-saving can be significantly improved using the EALPM algorithm.