A scalable cross-platform infrastructure for application performance tuning using hardware counters
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Process cruise control: event-driven clock scaling for dynamic power management
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Conserving disk energy in network servers
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
A control-theoretic approach to dynamic voltage scheduling
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Runtime Power Monitoring in High-End Processors: Methodology and Empirical Data
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
A First-Order Superscalar Processor Model
Proceedings of the 31st annual international symposium on Computer architecture
Formal online methods for voltage/frequency control in multiple clock domain microprocessors
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Exploiting Barriers to Optimize Power Consumption of CMPs
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Methods for Modeling Resource Contention on Simultaneous Multithreading Processors
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
A Power-Aware Run-Time System for High-Performance Computing
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Minimizing execution time in MPI programs on an energy-constrained, power-scalable cluster
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Accurate and efficient regression modeling for microarchitectural performance and power prediction
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Online power-performance adaptation of multithreaded programs using hardware event-based prediction
Proceedings of the 20th annual international conference on Supercomputing
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Balancing power consumption in multiprocessor systems
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Methods of inference and learning for performance modeling of parallel applications
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Limiting the power consumption of main memory
Proceedings of the 34th annual international symposium on Computer architecture
Managing energy-performance tradeoffs for multithreaded applications on multiprocessor architectures
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Prediction-Based Power-Performance Adaptation of Multithreaded Scientific Codes
IEEE Transactions on Parallel and Distributed Systems
WattApp: an application aware power meter for shared data centers
Proceedings of the 7th international conference on Autonomic computing
Towards optimizing energy costs of algorithms for shared memory architectures
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
SoftPower: fine-grain power estimations using performance counters
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Designing Energy Efficient Communication Runtime Systems for Data Centric Programming Models
GREENCOM-CPSCOM '10 Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Computing
Power profiling and optimization for heterogeneous multi-core systems
ACM SIGARCH Computer Architecture News
Critical path-based thread placement for NUMA systems
Proceedings of the second international workshop on Performance modeling, benchmarking and simulation of high performance computing systems
Complementing user-level coarse-grain parallelism with implicit speculative parallelism
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Hierarchical parallel approach in vascular network modeling: hybrid MPI+OpenMP implementation
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I
Critical path-based thread placement for NUMA systems
ACM SIGMETRICS Performance Evaluation Review
Energy-Efficient Thermal-Aware Autonomic Management of Virtualized HPC Cloud Infrastructure
Journal of Grid Computing
Designing energy efficient communication runtime systems: a view from PGAS models
The Journal of Supercomputing
Predicting Performance Impact of DVFS for Realistic Memory Systems
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Exploring hardware overprovisioning in power-constrained, high performance computing
Proceedings of the 27th international ACM conference on International conference on supercomputing
Holistic run-time parallelism management for time and energy efficiency
Proceedings of the 27th international ACM conference on International conference on supercomputing
On understanding the energy consumption of ARM-based multicore servers
Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems
Energy saving strategies for parallel applications with point-to-point communication phases
Journal of Parallel and Distributed Computing
Coordinated energy management in heterogeneous processors
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
E2SC '13 Proceedings of the 1st International Workshop on Energy Efficient Supercomputing
Hi-index | 0.00 |
Power has become a primary concern for HPC systems. Dynamic voltage and frequency scaling (DVFS) and dynamic concurrency throttling (DCT) are two software tools (or knobs) for reducing the dynamic power consumption of HPC systems. To date, few works have considered the synergistic integration of DVFS and DCT in performance-constrained systems, and, to the best of our knowledge, no prior research has developed application-aware simultaneous DVFS and DCT controllers in real systems and parallel programming frameworks. We present a multi-dimensional, online performance predictor, which we deploy to address the problem of simultaneous runtime optimization of DVFS and DCT on multi-core systems. We present results from an implementation of the predictor in a runtime library linked to the Intel OpenMP environment and running on an actual dual-processor quad-core system. We show that our predictor derives near-optimal settings of the power-aware program adaptation knobs that we consider. Our overall framework achieves significant reductions in energy (19% mean) and ED2 (40% mean), through simultaneous power savings (6% mean) and performance improvements (14% mean). We also find that our framework outperforms earlier solutions that adapt only DVFS or DCT, as well as one that sequentially applies DCT then DVFS. Further, our results indicate that prediction-based schemes for runtime adaptation compare favorably and typically improve upon heuristic search-based approaches in both performance and energy savings.