As the peak performance of modern computing platforms increases, their energy consumption grows as well, which may lead to overwhelming operating costs and failure rates. Techniques such as Dynamic Voltage and Frequency Scaling (DVFS) and CPU clock modulation (throttling) are often used to reduce the power consumption of compute nodes. However, these techniques must be applied judiciously during application execution to avoid significant performance losses. In this work, two implementations of the all-to-all collective operations are studied with respect to augmenting them with energy saving strategies on a per-call basis. Experiments were performed on the OSU MPI benchmarks as well as on a few real-world problems from the CPMD and NAS suites, in which energy consumption was reduced by up to 10% and 15.7%, respectively, with little performance degradation.