As the march to exascale computing gains momentum, the energy consumption of supercomputers has emerged as the critical roadblock. While architectural innovations are imperative for computing at this scale, exploiting them depends largely on the systems software. Parallel applications in many computationally intensive domains have been designed to leverage these supercomputers using the legacy two-sided communication semantics of the Message Passing Interface (MPI). At the same time, Partitioned Global Address Space (PGAS) models are being designed that provide global address space abstractions and one-sided communication for exploiting data locality and communication optimizations. PGAS models rely on one-sided communication runtime systems to leverage high-speed networks and achieve the best possible performance.

In this paper, we present a design for a Power Aware One-Sided Communication Library (PASCoL). The proposed design detects communication slack and exploits it for energy efficiency using Dynamic Voltage and Frequency Scaling (DVFS) and interrupt-driven execution. We implement our design and evaluate it using synthetic benchmarks for the one-sided communication primitives Put, Get, and Accumulate, and for uniformly noncontiguous data transfers. Our performance evaluation indicates that we can achieve a significant reduction in energy consumption without performance loss on multiple one-sided communication primitives. The achieved results are close to the theoretical peak available with the experimental test bed.
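The abstract does not state the rule used to decide when slack is worth exploiting, so the following is only an illustrative sketch of how such a decision might look, not PASCoL's actual implementation. It assumes a simple latency-plus-bandwidth estimate of the wait time for a one-sided operation, and compares it against overheads for a DVFS transition and an interrupt wakeup. The names `SlackPolicy`, `estimate_wait_us`, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


def estimate_wait_us(message_bytes: int,
                     bandwidth_bytes_per_us: float,
                     latency_us: float) -> float:
    """Estimate the wait for a transfer with a latency + size/bandwidth model.

    This model is an assumption made for illustration only.
    """
    return latency_us + message_bytes / bandwidth_bytes_per_us


@dataclass(frozen=True)
class SlackPolicy:
    # Hypothetical overheads in microseconds; not values from the paper.
    dvfs_transition_us: float = 50.0   # assumed cost of one frequency change
    interrupt_wakeup_us: float = 10.0  # assumed cost of waking on an interrupt

    def plan(self, expected_wait_us: float) -> dict:
        """Decide how to spend the expected wait (the communication slack)."""
        # Scaling down and back up costs two transitions, so only scale
        # when the slack is longer than that round trip.
        scale_down = expected_wait_us > 2 * self.dvfs_transition_us
        # Blocking on an interrupt instead of polling only pays off when
        # the slack exceeds the wakeup overhead.
        use_interrupt = expected_wait_us > self.interrupt_wakeup_us
        return {"scale_down": scale_down, "use_interrupt": use_interrupt}


if __name__ == "__main__":
    policy = SlackPolicy()
    # Small and large transfers on an assumed 1000 bytes/us link.
    for size in (1_000, 1_000_000):
        wait = estimate_wait_us(size, bandwidth_bytes_per_us=1000.0, latency_us=2.0)
        print(size, wait, policy.plan(wait))
```

Under these assumed overheads, short transfers keep polling at full frequency, while long transfers have enough slack to amortize both the interrupt wakeup and the frequency transitions.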