High-performance parallel computing architectures are increasingly based on multi-core processors. While current commercially available processors offer 8 and 16 cores, technological and power constraints are limiting the performance growth of the cores and leading to architectures with much higher core counts, such as the experimental many-core Intel Single-chip Cloud Computer (SCC) platform. These trends present new challenges to HPC applications, including programming complexity and the need for extreme energy efficiency. In this paper, we first investigate the power behavior of scientific Partitioned Global Address Space (PGAS) application kernels on the SCC platform and explore opportunities and challenges for power management within the PGAS framework. Results obtained via empirical evaluation of Unified Parallel C (UPC) applications on the SCC platform under different constraints show that, for specific operations, the potential for energy savings in PGAS is large, and that power/performance trade-offs can be effectively managed using a cross-layer approach. We then investigate cross-layer power management using PGAS language extensions and runtime mechanisms that manipulate power/performance trade-offs. Specifically, we present the design, implementation, and evaluation of a middleware for application-aware cross-layer power management of UPC applications on the SCC platform. Finally, based on our observations, we provide a set of insights that can be used to support similar power management for PGAS applications on other many-core platforms.