The insatiable demand for high-performance computing is driven by the most computationally intensive applications, such as computational chemistry, climate modeling, and nuclear physics. The last couple of decades have seen a tremendous rise in supercomputers, with architectures ranging from traditional clusters to system-on-a-chip designs, built to break the petaflop computing barrier. However, with the advent of petaflop-plus computing, we have entered an era in which a power-efficient system software stack is imperative for execution on exascale systems and beyond. At the same time, computationally intensive applications are exploring programming models beyond traditional message passing, such as a combination of Partitioned Global Address Space (PGAS) languages and libraries, which provide a one-sided communication paradigm with put, get, and accumulate primitives. To support the PGAS models, it is critical to design power-efficient, high-performance one-sided communication runtime systems. In this paper, we design and implement PASCoL, a high-performance, power-aware one-sided communication library built on the Aggregate Remote Memory Copy Interface (ARMCI), the communication runtime system of Global Arrays. For the various communication primitives provided by ARMCI, we study the impact of Dynamic Voltage/Frequency Scaling (DVFS) and of combining the interrupt-based (blocking) and polling-based mechanisms provided by most modern interconnects. We implement our design and evaluate it with synthetic benchmarks on an InfiniBand cluster. Our results indicate that PASCoL can achieve a significant reduction in the energy consumed per byte transferred, without additional penalty, for various one-sided communication primitives, message sizes, and data transfer patterns.
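The evaluation metric named in the abstract, energy consumed per byte transferred, can be illustrated with a minimal sketch. The abstract does not say how the authors measure it, so the sketch assumes that energy is average power multiplied by elapsed time; the function and parameter names (`energy_per_byte`, `relative_saving`, `avg_power_watts`, and so on) are hypothetical and are not part of PASCoL or ARMCI.

```python
def energy_per_byte(avg_power_watts: float,
                    elapsed_seconds: float,
                    bytes_transferred: int) -> float:
    """Return joules consumed per byte transferred.

    Assumption: energy is approximated as average power times elapsed
    time, which is a simplification and not the paper's stated method.
    """
    if bytes_transferred <= 0:
        raise ValueError("bytes_transferred must be positive")
    if avg_power_watts < 0 or elapsed_seconds < 0:
        raise ValueError("power and time must be non-negative")
    energy_joules = avg_power_watts * elapsed_seconds
    return energy_joules / bytes_transferred


def relative_saving(baseline: float, candidate: float) -> float:
    """Return the fractional reduction of `candidate` relative to `baseline`.

    Both arguments are energy-per-byte values, for example one measured
    with polling at full frequency and one with blocking plus DVFS.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - candidate) / baseline
```

Under these assumptions, a configuration that lowers average power reduces energy per byte only if the transfer time does not grow proportionally, which is why the metric is paired with a check on the performance penalty.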