The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
The optimum pipeline depth for a microprocessor
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Managing multi-configuration hardware via dynamic working set analysis
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
A case for dynamic pipeline scaling
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Optimizing pipelines for power and performance
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Pipeline stage unification: a low-energy consumption technique for future mobile processors
Proceedings of the 2003 international symposium on Low power electronics and design
Comparing Program Phase Detection Techniques
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Optimum Power/Performance Pipeline Depth
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Combined circuit and architectural level variable supply-voltage scaling for low power
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Discovering and Exploiting Program Phases
IEEE Micro
A Dynamic Control Mechanism for Pipeline Stage Unification by Identifying Program Phases
IEICE - Transactions on Information and Systems
Hi-index | 0.00 |
Recently, a method known as pipeline stage unification (PSU) has been proposed to alleviate the increasing energy consumption problem in modern microprocessors. PSU achieves a high energy efficiency by employing a changeable pipeline depth and its working scheme is eligible for a fine control method. In this paper, we propose a dynamic method to study fine-grained program interval behaviors based on some easy-to-get runtime processor metrics. Using this method to determine the proper PSU configurations during the program execution, we are able to achieve an averaged 13.5% energydelay-product (EDP) reduction for SPEC CPU2000 integer benchmarks, compared to the baseline processor. This value is only 0.14% larger than the theoretically idealized controlling. Our hardware synthesis result indicates that the proposed method can largely decrease the hardware overhead in both area and delay costs, as compared to a previous program study method which is based on working set signatures.