Communications of the ACM
Cost-performance analysis of heterogeneity in supercomputer architectures
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Speculative lock elision: enabling highly concurrent multithreaded execution
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Transactional lock-free execution of lock-based programs
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Speculative synchronization: applying thread-level speculation to explicitly parallel applications
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Transactional Memory Coherence and Consistency
Proceedings of the 31st annual international symposium on Computer architecture
Mitigating Amdahl's Law through EPI Throttling
Proceedings of the 32nd annual international symposium on Computer Architecture
The Impact of Performance Asymmetry in Emerging Multicore Architectures
Proceedings of the 32nd annual international symposium on Computer Architecture
Performance, Power Efficiency and Scalability of Asymmetric Cluster Chip Multiprocessors
IEEE Computer Architecture Letters
Core fusion: accommodating software diversity in chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Thousand core chips: a technology perspective
Proceedings of the 44th annual Design Automation Conference
Amdahl's law revisited for single chip systems
International Journal of Parallel Programming
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Amdahl's Law in the Multicore Era
Computer
Validity of the single processor approach to achieving large scale computing capabilities
AFIPS '67 (Spring) Proceedings of the April 18-20, 1967, spring joint computer conference
Accelerating critical section execution with asymmetric multi-core architectures
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Analyzing performance asymmetric multicore processors for latency sensitive datacenter applications
HotPower'10 Proceedings of the 2010 international conference on Power aware computing and systems
Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs?
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Rapid identification of architectural bottlenecks via precise event counting
Proceedings of the 38th annual international symposium on Computer architecture
Complementing user-level coarse-grain parallelism with implicit speculative parallelism
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
An analysis of power reduction in datacenters using heterogeneous chip multiprocessors
ACM SIGMETRICS Performance Evaluation Review
Amdahl's law for predicting the future of multicores considered harmful
ACM SIGARCH Computer Architecture News
HaLock: hardware-assisted lock contention detection in multithreaded applications
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Critical lock analysis: diagnosing critical section bottlenecks in multithreaded applications
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Criticality stacks: identifying critical threads in parallel programs using synchronization behavior
Proceedings of the 40th Annual International Symposium on Computer Architecture
Wimpy or brawny cores: A throughput perspective
Journal of Parallel and Distributed Computing
Amdahl's law in the era of process variation
International Journal of High Performance Systems Architecture
Hi-index | 0.00 |
This paper presents a fundamental law for parallel performance: it shows that parallel performance is not only limited by sequential code (as suggested by Amdahl's law) but is also fundamentally limited by synchronization through critical sections. Extending Amdahl's software model to include critical sections, we derive the surprising result that the impact of critical sections on parallel performance can be modeled as a completely sequential part and a completely parallel part. The sequential part is determined by the probability for entering a critical section and the contention probability (i.e., multiple threads wanting to enter the same critical section). This fundamental result reveals at least three important insights for multicore design. (i) Asymmetric multicore processors deliver less performance benefits relative to symmetric processors than suggested by Amdahl's law, and in some cases even worse performance. (ii) Amdahl's law suggests many tiny cores for optimum performance in asymmetric processors, however, we find that fewer but larger small cores can yield substantially better performance. (iii) Executing critical sections on the big core can yield substantial speedups, however, performance is sensitive to the accuracy of the critical section contention predictor.