We analyze thread-management techniques that increase performance or reduce energy in multicore and simultaneous multithreaded (SMT) processors. Thread delaying reduces energy consumption by running the core that holds the critical thread at maximum frequency while scaling down the frequency and voltage of the cores that hold noncritical threads. In this article, we provide a detailed breakdown of thread delaying on a simulated multicore microprocessor. Thread balancing improves overall performance by giving the critical thread higher priority in the issue queue of an SMT core. We provide a detailed breakdown of performance results for thread balancing, identifying its performance benefits and limitations. For benchmarks where no performance benefit is possible, we introduce a novel thread-balancing mechanism on an SMT core that reduces energy consumption instead. We performed a detailed study on an Intel microprocessor simulator running parallel applications. Thread delaying can reduce energy consumption by 4% to 44% with negligible performance loss. Thread balancing can increase performance by 20% or reduce energy consumption by 23%.
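To make the thread-delaying idea concrete, the following is a minimal illustrative sketch, not the mechanism evaluated in the article. It assumes each core has an estimate of the cycles remaining until the next barrier, that voltage scales linearly with frequency between two hypothetical operating points, and that dynamic energy per cycle is proportional to the square of the voltage. Leakage and the cost of switching operating points are ignored. All constants and function names are hypothetical.

```python
# Hypothetical operating range; real values depend on the processor.
F_MAX_GHZ = 3.0
F_MIN_GHZ = 1.0
V_MAX = 1.2
V_MIN = 0.8


def voltage_for(freq_ghz):
    """Assumed linear voltage-frequency relation between the two end points."""
    t = (freq_ghz - F_MIN_GHZ) / (F_MAX_GHZ - F_MIN_GHZ)
    return V_MIN + t * (V_MAX - V_MIN)


def plan_thread_delaying(remaining_cycles):
    """Pick a frequency per core so noncritical threads reach the barrier
    no later than the critical thread, which runs at F_MAX_GHZ."""
    critical = max(remaining_cycles)
    if critical == 0:
        return [F_MIN_GHZ for _ in remaining_cycles]
    # Time (in ns, since cycles / GHz = ns) the critical thread needs at full speed.
    deadline_ns = critical / F_MAX_GHZ
    freqs = []
    for cycles in remaining_cycles:
        slowest_ok = cycles / deadline_ns  # lowest frequency that meets the deadline
        freqs.append(min(F_MAX_GHZ, max(F_MIN_GHZ, slowest_ok)))
    return freqs


def dynamic_energy(cycles, freq_ghz):
    """Relative dynamic energy: cycles * V^2 (capacitance factored out)."""
    v = voltage_for(freq_ghz)
    return cycles * v * v


if __name__ == "__main__":
    work = [300, 150, 60]  # estimated cycles to the barrier, one entry per core
    freqs = plan_thread_delaying(work)
    baseline = sum(dynamic_energy(c, F_MAX_GHZ) for c in work)
    delayed = sum(dynamic_energy(c, f) for c, f in zip(work, freqs))
    print("frequencies (GHz):", freqs)
    print("relative energy saving:", 1.0 - delayed / baseline)
```

In this sketch the core with the most remaining work stays at full frequency, so the barrier is reached at the same time as without scaling. Cores with less work are slowed only as far as the deadline and the minimum operating point allow.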