Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Voltage scheduling problem for dynamically variable voltage processors
ISLPED '98 Proceedings of the 1998 international symposium on Low power electronics and design
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Space/time trade-offs in hash coding with allowable errors
Communications of the ACM
The design, implementation, and evaluation of a compiler algorithm for CPU energy reduction
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Temperature-aware microarchitecture
Proceedings of the 30th annual international symposium on Computer architecture
Transactional Memory Coherence and Consistency
Proceedings of the 31st annual international symposium on Computer architecture
Performance, Energy, and Thermal Considerations for SMT and CMP Architectures
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Improved automatic testcase synthesis for performance model validation
Proceedings of the 19th annual international conference on Supercomputing
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Improving fairness, throughput and energy-efficiency on a chip multiprocessor through DVFS
ACM SIGARCH Computer Architecture News
BulkSC: bulk enforcement of sequential consistency
Proceedings of the 34th annual international symposium on Computer architecture
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Implementing Signatures for Transactional Memory
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Adaptive transaction scheduling for transactional memory systems
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Variation-Aware Application Scheduling and Power Management for Chip Multiprocessors
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
SBAC-PAD '08 Proceedings of the 2008 20th International Symposium on Computer Architecture and High Performance Computing
Notary: Hardware techniques to enhance signatures
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Thread motion: fine-grained power management for multi-core systems
Proceedings of the 36th annual international symposium on Computer architecture
Clock gate on abort: Towards energy-efficient hardware Transactional Memory
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing
On the (dis)similarity of transactional memory workloads
IISWC '09 Proceedings of the 2009 IEEE International Symposium on Workload Characterization (IISWC)
Energy and throughput efficient transactional memory for embedded multicore systems
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Hi-index | 0.00 |
Power has emerged as a first-order design constraint in modern processors and has energized microarchitecture researchers to produce a growing number of power optimization proposals. Almost in tandem with the move toward more energy-efficient designs, architects have been increasing the number of processing elements (PEs) on a single chip and promoting the concept of running multithreaded workloads. Nevertheless, software is still lagging behind and is often unable to exploit these additional resources -- giving rise to transactional memory. Transactional memory is a promising programming abstraction that makes it easier for programmers to exploit the resources available in many- core processor systems by removing some of the complexity associated with traditional lock-based programming. This paper proposes new techniques to merge the power and transactional memory domains. An analysis of the per-core and chip-wide power consumption of hardware transactional memory systems (HTMs) pinpoints two areas ripe for power management policies: transactional stalls and aborts. The first proposed policy uses dynamic voltage and frequency scaling (DVFS) during transactional stall periods. By frequency scaling PEs based on their transactional state, DVFS can increase the throughput and energy efficiency of HTMs. The second method uses a transaction's conflict probability to reschedule transactions and clock gate aborted PEs to reduce overall contention and power consumption within the system. The proposed techniques are evaluated using three HTM configurations and are shown to reduce the energy delay squared product (ED2P) of the STAMP and SPLASH-2 benchmarks by an average of 18% when combined. Synthetic workloads are used to explore a wider range of program behaviors and the optimizations are shown to reduce the ED2P by an average of 29%. For a comparison, this work is shown reduce the ED2P by up to 30% relative to previous proposals for energy reduction in HTMs (e.g. transaction serialization).