The practically attainable worst-case power consumption of a computer system is a significant design parameter, but determining it by manually writing high-power code snippets, called power viruses, is very tedious. Previous efforts to automate power virus generation are all limited to single-core processors. They are not effective on multicore parallel systems, because components such as the interconnection network, shared caches, DRAM, and the coherence directory also contribute significantly to the power consumption of such systems. In this paper we propose MAximum Multicore POwer (MAMPO), the first attempt at a framework that automatically generates a multithreaded power virus for a given multicore parallel system configuration. We show that, on three different parallel multicore system configurations, the power viruses generated by MAMPO consume 40% to 89% more power than running multiple copies of single-core power viruses such as the MPrime torture test and SYMPO, the most recently published previous work. We further demonstrate the strength of the MAMPO viruses by comparing their power consumption with that of the workloads in the PARSEC benchmark suite and of the commercial Java benchmark SPECjbb. The MAMPO viruses consume 45% to 98% more power than the average of the PARSEC workloads and 41% to 56% more power than SPECjbb.