In this paper, we evaluate an adaptive loop parallelization strategy (i.e., a strategy that allows each loop nest to execute using a different number of processors if doing so is beneficial) and measure the potential energy savings when the processors left unused during the execution of a nested loop in a multi-processor on-a-chip (MPoC) are shut down (i.e., placed into a power-down or sleep state). Our results show that, across a set of array-intensive applications, shutting down unused processors can save as much as 67% of energy at a performance loss of up to 17%. We also discuss and evaluate a processor pre-activation strategy based on compile-time analysis of nested loops. Based on our experiments, we conclude that an adaptive loop parallelization strategy combined with idle processor shut-down and pre-activation can be very effective in reducing energy consumption without increasing execution time.
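The interplay between per-nest processor counts, shut-down, and pre-activation can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's evaluation methodology: the Amdahl-style cost function, the power constants, the wake-up latency, and all names (`LoopNest`, `nest_cycles`, `best_proc_count`, `evaluate`) are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

# All constants below are hypothetical, in arbitrary units.
TOTAL_PROCS = 8       # processors on the chip
P_ACTIVE = 1.0        # energy per cycle of a processor doing work
P_IDLE = 0.6          # energy per cycle of an unused but powered processor
P_SLEEP = 0.05        # energy per cycle of a shut-down processor
WAKEUP_CYCLES = 500   # latency to bring sleeping processors back


@dataclass(frozen=True)
class LoopNest:
    name: str
    serial_cycles: int          # cycles when run on one processor
    parallel_fraction: float    # share of the work that parallelizes
    sync_cycles_per_proc: int   # assumed overhead per extra processor


def nest_cycles(nest: LoopNest, procs: int) -> float:
    """Assumed cost model: Amdahl-style speedup plus linear sync overhead."""
    work = nest.serial_cycles * (
        (1 - nest.parallel_fraction) + nest.parallel_fraction / procs
    )
    return work + nest.sync_cycles_per_proc * (procs - 1)


def best_proc_count(nest: LoopNest) -> int:
    """Adaptive choice: the processor count that minimizes this nest's cycles."""
    return min(range(1, TOTAL_PROCS + 1), key=lambda p: nest_cycles(nest, p))


def evaluate(nests, shut_down: bool, pre_activate: bool):
    """Return (total cycles, total energy) for a sequence of loop nests."""
    total_cycles = 0.0
    total_energy = 0.0
    awake = TOTAL_PROCS  # every processor is powered at program start
    for nest in nests:
        procs = best_proc_count(nest)
        cycles = nest_cycles(nest, procs)
        # Without pre-activation, waking extra processors stalls the nest.
        # Simplification: the stall is charged at the nest's normal power.
        if shut_down and procs > awake and not pre_activate:
            cycles += WAKEUP_CYCLES
        unused = TOTAL_PROCS - procs
        unused_power = P_SLEEP if shut_down else P_IDLE
        total_energy += cycles * (procs * P_ACTIVE + unused * unused_power)
        total_cycles += cycles
        awake = procs if shut_down else TOTAL_PROCS
    return total_cycles, total_energy
```

In this toy model, a nest that parallelizes poorly picks fewer than `TOTAL_PROCS` processors, so shutting down the rest lowers energy. A following nest that needs more processors pays `WAKEUP_CYCLES` unless the wake-up is issued early, which is the role the abstract assigns to compile-time pre-activation.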