The past 10 years have delivered two significant revolutions. (1) Microprocessor design has been transformed by the limits of chip power, wire latency, and Dennard scaling---leading to multicore processors and heterogeneity. (2) Managed languages and an entirely new software landscape emerged---revolutionizing how software is deployed, is sold, and interacts with hardware. Researchers most often examine these changes in isolation. Architects mostly grapple with microarchitecture design through the narrow software context of native sequential SPEC CPU benchmarks, while language researchers mostly consider microarchitecture in terms of performance alone. This work explores the clash of these two revolutions over the past decade by measuring power, performance, energy, and scaling, and considers what the results may mean for the future. Our diverse findings include the following: (a) native sequential workloads do not approximate managed workloads or even native parallel workloads; (b) diverse application power profiles suggest that future applications and system software will need to participate in power optimization and management; and (c) software and hardware researchers need access to real measurements to optimize for power and energy.