Power-performance considerations of parallel computing on chip multiprocessors

Authors:
Jian Li;José F. Martínez
Affiliations:
Cornell University, Ithaca, NY;Cornell University, Ithaca, NY
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2005


Abstract

This paper examines the power-performance implications of running parallel applications on chip multiprocessors (CMPs). First, we develop an analytical model that, for the first time, combines parallel efficiency, granularity of parallelism, and voltage/frequency scaling to establish a formal connection between these factors and the power consumption and performance of a parallel code running on a CMP. We then run detailed simulations of parallel applications on a power-performance CMP model to confirm the analytical results and provide further insights. Both the analytical model and the experiments show that parallel computing can bring significant power savings while still meeting a given performance target, provided that granularity and voltage/frequency levels are chosen judiciously. The particular choice, however, depends on the application's parallel efficiency curve and on the process technology used, which our model captures. Likewise, both the analytical model and the experiments show the effect of a limited power budget on the application's scalability curve. In particular, we show that a limited power budget can cause rapid performance degradation beyond a certain number of cores, even for applications with excellent scalability properties. On the other hand, our experiments show that, under a limited power budget, power-thrifty memory-bound applications may actually enjoy better scalability than more compute-intensive codes, even if the latter would exhibit higher scalability in a power-unconstrained scenario.
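The trade-off described in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's analytical model. It assumes that only dynamic power matters, that dynamic power is proportional to V^2 * f per core, and that supply voltage scales linearly with frequency. The function names `relative_dynamic_power` and `budgeted_speedup`, and the normalization to one core at nominal voltage/frequency, are hypothetical choices made for illustration.

```python
def relative_dynamic_power(n_cores: int, efficiency: float) -> float:
    """Dynamic power of an n-core run, relative to one core at nominal V/f,
    when the n-core run is slowed just enough to match one-core performance.

    Assumptions (not from the paper): power ~ n * V^2 * f, and V scales
    linearly with f. Static power is ignored.
    """
    if n_cores < 1 or not (0.0 < efficiency <= 1.0):
        raise ValueError("need n_cores >= 1 and 0 < efficiency <= 1")
    # At nominal frequency the speedup is n * efficiency, so each core can
    # run at 1 / (n * efficiency) of nominal frequency and still match the
    # single-core execution time.
    freq_scale = 1.0 / (n_cores * efficiency)
    volt_scale = freq_scale  # assumed linear V-f relation
    return n_cores * volt_scale ** 2 * freq_scale


def budgeted_speedup(n_cores: int, efficiency: float, budget: float) -> float:
    """Speedup over one nominal core when total dynamic power is capped.

    `budget` is expressed in units of one core's nominal dynamic power.
    Same assumptions as above; frequency is never raised above nominal.
    """
    if n_cores < 1 or not (0.0 < efficiency <= 1.0) or budget <= 0.0:
        raise ValueError("invalid arguments")
    # n * s^3 <= budget  =>  s <= (budget / n)^(1/3), capped at nominal (1.0).
    freq_scale = min(1.0, (budget / n_cores) ** (1.0 / 3.0))
    return n_cores * efficiency * freq_scale


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, relative_dynamic_power(n, 1.0), budgeted_speedup(n, 1.0, 1.0))
```

Under these assumptions, matching single-core performance with n cores at efficiency e costs 1 / (n^2 * e^3) of the single-core dynamic power, so lower parallel efficiency erodes the savings quickly. With a fixed budget, the sketch yields speedup that grows more slowly than the core count; it does not by itself reproduce the rapid degradation the abstract reports, because it leaves out static power, voltage limits, and process-technology effects.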