Applications on today's high-end systems typically make varying load demands over time. A single application may pass through many different phases during its lifetime, and workload mixes show interleaved phases. Memory-intensive workloads or phases may exhibit performance saturation at frequencies below the processors' maximum because of the disparity between processor and memory speeds. Performance saturation is a sign of over-provisioning and leads to energy-inefficient systems. Computers using heterogeneous processors, with the same ISA but different implementation details, have been proposed as a way of reducing power while avoiding or limiting performance degradation. However, using heterogeneous processors effectively is complicated and requires intelligent scheduling.

The research reported here explores the use of a heterogeneous system of processors with identical ISAs and implementation details, but with differing voltages and frequencies. The scheduler uses the execution characteristics of each application to predict its future processing needs and then schedules it to a processor that matches those needs, if one is available. The predictions are used to minimize the performance loss to the system as a whole rather than that of a single application. The result limits system power while minimizing total performance loss. A prototype implementation on a Power4 four-processor system is presented. The prototype scheduler is validated using both synthetic and real-world benchmarks. The prototype shows reasonable predictor accuracy and significant power savings for memory-bound applications.
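The scheduling idea described above can be illustrated with a minimal sketch. It is not the paper's predictor or scheduler: the two-term time model (core cycles that scale with frequency plus memory stall time that does not), the function names, and the sample numbers are all assumptions made for illustration. The sketch predicts each task's slowdown on each fixed-frequency processor and picks the placement with the smallest total predicted loss across the system.

```python
from itertools import permutations

def predicted_time(cpu_cycles, mem_seconds, freq_hz):
    """Assumed model: core work scales with frequency, memory stalls do not."""
    return cpu_cycles / freq_hz + mem_seconds

def predicted_loss(task, freq_hz, max_freq_hz):
    """Fractional slowdown relative to running at the maximum frequency."""
    cpu_cycles, mem_seconds = task
    base = predicted_time(cpu_cycles, mem_seconds, max_freq_hz)
    return predicted_time(cpu_cycles, mem_seconds, freq_hz) / base - 1.0

def assign(tasks, freqs_hz):
    """Return (placement, total_loss); placement maps task name -> processor index.

    Exhaustive search over one-task-per-processor placements, which is only
    practical for a handful of processors. Assumes len(tasks) <= len(freqs_hz).
    """
    max_f = max(freqs_hz)
    names = list(tasks)
    best = None
    for perm in permutations(range(len(freqs_hz)), len(names)):
        total = sum(
            predicted_loss(tasks[name], freqs_hz[proc], max_f)
            for name, proc in zip(names, perm)
        )
        if best is None or total < best[1]:
            best = (dict(zip(names, perm)), total)
    return best

# Hypothetical workload: (core cycles, memory stall seconds) per interval.
tasks = {
    "cpu_bound": (1.0e9, 0.0),
    "mem_bound": (1.0e8, 0.9),
}
freqs_hz = [1.0e9, 5.0e8]  # one fast and one slow processor (made-up values)

placement, total_loss = assign(tasks, freqs_hz)
print(placement, round(total_loss, 3))
```

Under these assumed numbers the memory-bound task saturates: halving its frequency costs it about 10% while the CPU-bound task would lose 100%, so the search places the memory-bound task on the slow processor. Minimizing the summed loss, rather than each task's own loss, mirrors the system-wide objective the abstract describes.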