Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Core fusion: accommodating software diversity in chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Composable Lightweight Processors
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Amdahl's Law in the Multicore Era
Computer
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Validity of the single processor approach to achieving large scale computing capabilities
AFIPS '67 (Spring) Proceedings of the April 18-20, 1967, spring joint computer conference
Accelerating critical section execution with asymmetric multi-core architectures
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Over-provisioned multicore systems
Over-provisioned multicore systems
Many-Core vs. Many-Thread Machines: Stay Away From the Valley
IEEE Computer Architecture Letters
Understanding PARSEC performance on contemporary CMPs
IISWC '09 Proceedings of the 2009 IEEE International Symposium on Workload Characterization (IISWC)
Conservation cores: reducing the energy of mature computations
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Energy-performance tradeoffs in processor architecture and circuit design: a marginal cost analysis
Proceedings of the 37th annual international symposium on Computer architecture
Understanding sources of inefficiency in general-purpose chips
Proceedings of the 37th annual international symposium on Computer architecture
Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs?
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Toward Dark Silicon in Servers
IEEE Micro
The autonomic operating system research project: achievements and future directions
Proceedings of the 50th Annual Design Automation Conference
Power gating applied to MP-SoCs for standby-mode power management
Proceedings of the 50th Annual Design Automation Conference
Proceedings of the Seventh Workshop on Programming Languages and Operating Systems
Towards software performance engineering for multicore and manycore systems
ACM SIGMETRICS Performance Evaluation Review
Hi-index | 48.22 |
Starting in 2004, the microprocessor industry has shifted to multicore scaling---increasing the number of cores per die each generation---as its principal strategy for continuing performance growth. Many in the research community believe that this exponential core scaling will continue into the hundreds or thousands of cores per chip, auguring a parallelism revolution in hardware or software. However, while transistor count increases continue at traditional Moore's Law rates, the per-transistor speed and energy efficiency improvements have slowed dramatically. Under these conditions, more cores are only possible if the cores are slower, simpler, or less utilized with each additional technology generation. This paper brings together transistor technology, processor core, and application models to understand whether multicore scaling can sustain the historical exponential performance growth in this energy-limited era. As the number of cores increases, power constraints may prevent powering of all cores at their full speed, requiring a fraction of the cores to be powered off at all times. According to our models, the fraction of these chips that is "dark" may be as much as 50% within three process generations. The low utility of this "dark silicon" may prevent both scaling to higher core counts and ultimately the economic viability of continued silicon scaling. Our results show that core count scaling provides much less performance gain than conventional wisdom suggests. Under (highly) optimistic scaling assumptions---for parallel workloads---multicore scaling provides a 7.9× (23% per year) over ten years. Under more conservative (realistic) assumptions, multicore scaling provides a total performance gain of 3.7× (14% per year) over ten years, and obviously less when sufficiently parallel workloads are unavailable. Without a breakthrough in process technology or microarchitecture, other directions are needed to continue the historical rate of performance improvement.