Looking back on the language and hardware revolutions: measured power, performance, and scaling
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
This paper reports and analyzes measured chip power and performance on five process technology generations executing 61 diverse benchmarks with a rigorous methodology. We measure representative Intel IA32 processors with technologies ranging from 130nm to 32nm while they execute sequential and parallel benchmarks written in native and managed languages. During this period, hardware and software changed substantially: (1) hardware vendors delivered chip multiprocessors instead of uniprocessors, and independently (2) software developers increasingly chose managed languages instead of native languages. These quantitative data reveal the extent of some known and some previously unobserved hardware and software trends. Two themes emerge. (I) Workload: The power, performance, and energy trends of native workloads do not approximate those of managed workloads. For example, (a) the SPEC CPU2006 native benchmarks on the i7 (45nm) and i5 (32nm) draw significantly less power than managed or scalable native benchmarks; and (b) managed runtimes exploit parallelism even when running single-threaded applications. The results suggest that architects should always include both native and managed workloads when designing and evaluating energy-efficient hardware. (II) Architecture: Clock scaling, microarchitecture, simultaneous multithreading, and chip multiprocessors each elicit a huge variety of power, performance, and energy responses. This variety, together with the difficulty of obtaining power measurements, argues for exposing on-chip power meters and, when possible, structure-specific power meters for cores, caches, and other structures. Just as hardware event counters provide a quantitative grounding for performance innovations, power meters are necessary for optimizing energy.