Looking back on the language and hardware revolutions: measured power, performance, and scaling

Authors:
Hadi Esmaeilzadeh;Ting Cao;Yang Xi;Stephen M. Blackburn;Kathryn S. McKinley
Affiliations:
The University of Washington, Seattle, WA, USA;Australian National University, Canberra, Australia;Australian National University, Canberra, Australia;Australian National University, Canberra, Australia;The University of Texas at Austin, Austin, TX, USA
Venue:
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Year:
2011



Abstract

This paper reports and analyzes measured chip power and performance on five process technology generations executing 61 diverse benchmarks with a rigorous methodology. We measure representative Intel IA32 processors with technologies ranging from 130nm to 32nm while they execute sequential and parallel benchmarks written in native and managed languages. During this period, hardware and software changed substantially: (1) hardware vendors delivered chip multiprocessors instead of uniprocessors, and independently (2) software developers increasingly chose managed languages instead of native languages. These quantitative data reveal the extent of some known and previously unobserved hardware and software trends. Two themes emerge. (I) Workload: The power, performance, and energy trends of native workloads do not approximate those of managed workloads. For example, (a) the SPEC CPU2006 native benchmarks on the i7 (45nm) and i5 (32nm) draw significantly less power than managed or scalable native benchmarks; and (b) managed runtimes exploit parallelism even when running single-threaded applications. The results recommend that architects always include native and managed workloads when designing and evaluating energy-efficient hardware. (II) Architecture: Clock scaling, microarchitecture, simultaneous multithreading, and chip multiprocessors each elicit a huge variety of power, performance, and energy responses. This variety and the difficulty of obtaining power measurements recommend exposing on-chip power meters and, when possible, structure-specific power meters for cores, caches, and other structures. Just as hardware event counters provide a quantitative grounding for performance innovations, power meters are necessary for optimizing energy.