Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
Power and CAD considerations for the 1.75mbyte, 1.2ghz L2 cache on the alpha 21364 CPU
Proceedings of the 12th ACM Great Lakes symposium on VLSI
Reducing power density through activity migration
Proceedings of the 2003 international symposium on Low power electronics and design
Characterizing and Predicting Program Behavior and its Variability
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Heat-and-run: leveraging SMT and CMP to manage power density through the operating system
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Coordinated, distributed, formal energy management of chip multiprocessors
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Combined circuit and architectural level variable supply-voltage scaling for low power
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Reducing the Latency and Area Cost of Core Swapping through Shared Helper Engines
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Core fusion: accommodating software diversity in chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Analysis of dynamic voltage/frequency scaling in chip-multiprocessors
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
Larrabee: a many-core x86 architecture for visual computing
ACM SIGGRAPH 2008 papers
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Trading off Cache Capacity for Reliability to Enable Low Voltage Operation
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Characterizing processor thermal behavior
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Proceedings of the 7th ACM international conference on Computing frontiers
WiDGET: Wisconsin decoupled grid execution tiles
Proceedings of the 37th annual international symposium on Computer architecture
Data marshaling for multi-core architectures
Proceedings of the 37th annual international symposium on Computer architecture
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU
Proceedings of the 37th annual international symposium on Computer architecture
Proceedings of the 37th annual international symposium on Computer architecture
Scalable thread scheduling and global power management for heterogeneous many-core architectures
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Proceedings of the international conference on Supercomputing
Deadlock-free fine-grained thread migration
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Scalable power control for many-core architectures running multi-threaded applications
Proceedings of the 38th annual international symposium on Computer architecture
Power profiling and optimization for heterogeneous multi-core systems
ACM SIGARCH Computer Architecture News
System-level application-aware dynamic power management in adaptive pipelined MPSoCs for multimedia
Proceedings of the International Conference on Computer-Aided Design
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
Journal of Systems and Software
Pack & Cap: adaptive DVFS and thread packing under power caps
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
PGCapping: exploiting power gating for power capping and core lifetime balancing in CMPs
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Computers and Industrial Engineering
Toward on-chip datacenters: a perspective on general trends and on-chip particulars
The Journal of Supercomputing
Moths: Mobile threads for on-chip networks
ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
ACM Transactions on Architecture and Code Optimization (TACO)
SLICC: Self-Assembly of Instruction Cache Collectives for OLTP Workloads
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Composite Cores: Pushing Heterogeneity Into a Core
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Explicit transient thermal simulation of liquid-cooled 3D ICs
Proceedings of the Conference on Design, Automation and Test in Europe
Architecturally homogeneous power-performance heterogeneous multicore systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Energy-efficient virtual machine scheduling in performance-asymmetric multi-core architectures
Proceedings of the 8th International Conference on Network and Service Management
Dynamic power management for multidomain system-on-chip platforms: An optimal control approach
ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special Section on Networks on Chip: Architecture, Tools, and Methodologies
SMT-centric power-aware thread placement in chip multiprocessors
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Trace based phase prediction for tightly-coupled heterogeneous cores
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Energy-efficient work-stealing language runtimes
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Price theory based power management for heterogeneous multi-cores
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
A generalized software framework for accurate and efficient management of performance goals
Proceedings of the Eleventh ACM International Conference on Embedded Software
Dynamic server power capping for enabling data center participation in power markets
Proceedings of the International Conference on Computer-Aided Design
Dynamic Power and Thermal Management of NoC-Based Heterogeneous MPSoCs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Hi-index | 0.00 |
Dynamic voltage and frequency scaling (DVFS) is a commonly-used power-management scheme that dynamically adjusts power and performance to the time-varying needs of running programs. Unfortunately, conventional DVFS, relying on off-chip regulators, faces limitations in terms of temporal granularity and high costs when considered for future multi-core systems. To overcome these challenges, this paper presents thread motion (TM), a fine-grained power-management scheme for chip multiprocessors (CMPs). Instead of incurring the high cost of changing the voltage and frequency of different cores, TM enables rapid movement of threads to adapt the time-varying computing needs of running applications to a mixture of cores with fixed but different power/performance levels. Results show that for the same power budget, two voltage/frequency levels are sufficient to provide performance gains commensurate to idealized scenarios using per-core voltage control. Thread motion extends workload-based power management into the nanosecond realm and, for a given power budget, provides up to 20% better performance than coarse-grained DVFS.