SUIF: an infrastructure for research on parallelizing and optimizing compilers
ACM SIGPLAN Notices
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
A Theory of Fault-Tolerant Routing in Wormhole Networks
IEEE Transactions on Parallel and Distributed Systems
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Space-time scheduling of instruction-level parallelism on a raw machine
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Design issues for dynamic voltage scaling
ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Route packets, not wires: on-chip inteconnection networks
Proceedings of the 38th annual Design Automation Conference
Energy-conscious compilation based on voltage scaling
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
An adaptive low-power transmission scheme for on-chip networks
Proceedings of the 15th international symposium on System Synthesis
The Alpha 21364 Network Architecture
IEEE Micro
Orion: a power-performance simulator for interconnection networks
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Compile-time dynamic voltage scaling settings: opportunities and limits
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Dynamic Voltage Scaling with Links for Power Optimization of Interconnection Networks
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Power constrained design of multiprocessor interconnection networks
ICCD '97 Proceedings of the 1997 International Conference on Computer Design (ICCD '97)
Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture
Proceedings of the 30th annual international symposium on Computer architecture
Energy characterization of a tiled architecture processor with on-chip networks
Proceedings of the 2003 international symposium on Low power electronics and design
Energy optimization techniques in cluster interconnects
Proceedings of the 2003 international symposium on Low power electronics and design
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Exploiting Processor Workload Heterogeneity for Reducing Energy Consumption in Chip Multiprocessors
Proceedings of the conference on Design, automation and test in Europe - Volume 2
×pipesCompiler: A Tool for Instantiating Application Specific Networks on Chip
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Evaluation of the Raw Microprocessor: An Exposed-Wire-Delay Architecture for ILP and Streams
Proceedings of the 31st annual international symposium on Computer architecture
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
High-level power analysis for on-chip networks
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Design-Space Exploration of Power-Aware On/Off Interconnection Networks
ICCD '04 Proceedings of the IEEE International Conference on Computer Design
Exploring the Design Space of Power-Aware Opto-Electronic Networked Systems
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Theory, Volume 1, Queueing Systems
Theory, Volume 1, Queueing Systems
Comparing Adaptive Routing and Dynamic Voltage Scaling for Link Power Reduction
IEEE Computer Architecture Letters
Reducing NoC energy consumption through compiler-directed channel voltage scaling
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
High-level power analysis for multi-core chips
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Software-directed power-aware interconnection networks
ACM Transactions on Architecture and Code Optimization (TACO)
Software-directed combined cpu/link voltage scaling fornoc-based cmps
SIGMETRICS '08 Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Application mapping for chip multiprocessors
Proceedings of the 45th annual Design Automation Conference
Dynamic power saving in fat-tree interconnection networks using on/off links
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
A survey on techniques for improving the energy efficiency of large-scale distributed systems
ACM Computing Surveys (CSUR)
Hi-index | 0.01 |
Interconnection networks have been deployed as the communication fabric in a wide range of parallel computer systems. With recent technological trends allowing growing quantities of chip resources and faster clock rates, there have been prevailing concerns of increasing power consumption being a major limiting factor in the design of parallel computer systems, from multiprocessor SoCs to multi-chip embedded systems and parallel servers. To tackle this, power-aware networks must become inherent components of single-chip and multi-chip system.On the hardware design side, while there has been some recent interconnection network power reduction research, especially targeted towards communication links, the techniques presented are ad hoc and are not tailored to the application running on the network. We show that with these ad hoc techniques, power savings and corresponding impact on network latency vary significantly from one application to the next -- in many cases network performance can suffer severely. On the software side, extensive research on compile-time optimization has produced parallelizing compilers that can efficiently map an application onto hardware for high performance. However, research into power-aware parallelizing compilers is in its infancy; none addressed communication power.In this paper, we take the first steps towards tailoring applications' communication needs at run-time for low power. We propose software techniques that extend the flow of a parallelizing compiler in order to direct run-time network power optimization. We target network links, the dominant power consumer in these systems, allowing DVS instructions extracted during static compilation to orchestrate link voltage and frequency transitions for power savings during application runtime. Concurrently, a hardware online mechanism measures network congestion levels and adapts these off-line DVS settings to optimize network performance. Our simulations show that link power consumption can be greatly reduced by up to 76.3%, with a minor increase in network latency in the range of 0.23 to 6.78% across a number of benchmark suites running on three existing parallel architectures, from very fine-grained single-chip to coarse-grained multi-chip architectures.