Exploiting Processor Workload Heterogeneity for Reducing Energy Consumption in Chip Multiprocessors

Authors:
I. Kadayif;M. Kandemir;I. Kolcu
Venue:
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Year:
2004


Abstract

Advances in semiconductor technology are enabling designs with several hundred million transistors. Since building sophisticated single-processor systems is a complex process from the design, verification, and software development perspectives, the use of chip multiprocessing is inevitable in future microprocessors. In fact, the abundance of explicit loop-level parallelism in many embedded applications makes chip multiprocessing one of the most promising directions in designing systems for embedded applications. Another architectural trend that we observe in embedded systems, namely multi-voltage processors, is driven by the need to reduce energy consumption during program execution. Practical implementations such as Transmeta's Crusoe and Intel's XScale tune processor voltage/frequency depending on the current execution load. Considering these two trends, chip multiprocessing and voltage/frequency scaling, this paper presents an optimization strategy for an architecture that makes use of both chip parallelism and voltage scaling. In our proposal, the compiler takes advantage of the heterogeneity among the loads of different processors in parallel execution and assigns different voltages/frequencies to different processors if doing so reduces energy consumption without increasing overall execution cycles significantly. Our experiments with a set of applications show that this optimization can bring large energy benefits without much performance loss.
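The idea of assigning per-processor voltages/frequencies based on load imbalance can be illustrated with a minimal sketch. This is not the authors' compiler algorithm. It assumes a hypothetical table of voltage/frequency levels (`LEVELS`), hypothetical per-processor cycle counts for one parallel loop, and the common simplification that dynamic energy is proportional to cycles times voltage squared. The names `Level`, `assign_levels`, `dynamic_energy`, and the `slack` parameter are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """One voltage/frequency operating point (values are hypothetical)."""
    volts: float
    mhz: float


# Hypothetical operating points, not taken from the paper.
LEVELS = [Level(0.8, 100.0), Level(1.0, 200.0), Level(1.2, 300.0), Level(1.4, 400.0)]


def assign_levels(cycles, levels=LEVELS, slack=0.0):
    """Pick, for each processor, the slowest level that still meets the deadline.

    The deadline is the time the most heavily loaded processor needs at the
    fastest level, optionally stretched by a fractional `slack` (assumed knob
    for tolerating a small performance loss).
    """
    by_speed = sorted(levels, key=lambda lv: lv.mhz)  # slowest first
    fastest = by_speed[-1]
    deadline = max(cycles) / fastest.mhz * (1.0 + slack)
    chosen = []
    for c in cycles:
        # The fastest level always satisfies the deadline, so next() succeeds.
        chosen.append(next(lv for lv in by_speed if c / lv.mhz <= deadline))
    return chosen


def dynamic_energy(cycles, chosen):
    """Relative dynamic energy under the assumed model E ~ cycles * V^2."""
    return sum(c * lv.volts ** 2 for c, lv in zip(cycles, chosen))


if __name__ == "__main__":
    loads = [400, 200, 100, 300]  # hypothetical per-processor cycle counts
    scaled = assign_levels(loads)
    baseline = [max(LEVELS, key=lambda lv: lv.mhz)] * len(loads)
    for load, lv in zip(loads, scaled):
        print(f"{load} cycles -> {lv.mhz:.0f} MHz at {lv.volts:.1f} V")
    print("baseline energy:", dynamic_energy(loads, baseline))
    print("scaled energy:  ", dynamic_energy(loads, scaled))
```

With these assumed loads, only the most heavily loaded processor stays at the fastest level; the others drop to slower levels that still finish by the same deadline, so the modeled energy falls while the completion time of the parallel loop is unchanged. Setting `slack` above zero trades a bounded slowdown for further savings, which mirrors the "without increasing overall execution cycles significantly" condition in the abstract.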