Optimum Power/Performance Pipeline Depth
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
The impact of pipeline length on both the power and performance of a microprocessor is explored through theory and simulation. A theory is presented for a range of power/performance metrics, BIPS^m/W. The theory shows that the more heavily the metric weights power, the shorter the resulting optimum pipeline length. For typical parameters, neither BIPS/W nor BIPS^2/W yields an optimum, i.e., a non-pipelined design is optimal. For BIPS^3/W the optimum, averaged over all 55 workloads studied, occurs at a 22.5 FO4 design point, a 7-stage pipeline, but this value is highly dependent on the assumed growth in latch count with pipeline depth. As dynamic power grows, the optimal design point shifts to shorter pipelines. Clock gating pushes the optimum to deeper pipelines. Surprisingly, as leakage power grows, the optimum is also found to shift to deeper pipelines. The optimum pipeline depth varies for different classes of workloads: SPEC95 and SPEC2000 integer applications, traditional (legacy) database and on-line transaction processing applications, modern (e.g., web) applications, and floating point applications.
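The way the exponent m in BIPS^m/W moves the optimum can be illustrated with a small numerical sketch. This is a toy model, not the theory from the paper: the functional forms and every constant below (total logic delay `t_p`, latch overhead `t_o`, issue width `alpha`, hazard rate `hazard`, latch-growth exponent `eta`, and the `dyn` and `leak` power weights) are assumptions chosen for illustration. Clock gating is not modeled.

```python
# Toy model of the BIPS^m/W metric versus pipeline depth.
# All functional forms and constants are illustrative assumptions,
# not values or equations taken from the paper.

def time_per_instruction(p, t_p=140.0, t_o=2.5, alpha=2.0, hazard=0.05):
    """Average time per instruction (in FO4 delays) for a p-stage pipeline."""
    cycle = t_o + t_p / p          # logic per stage plus latch overhead
    busy = cycle / alpha           # ideal issue of alpha instructions per cycle
    stall = hazard * cycle * p     # each hazard is assumed to cost a pipeline refill
    return busy + stall

def power(p, t_p=140.0, t_o=2.5, eta=1.1, dyn=1.0, leak=0.0):
    """Relative power: latch count grows as p**eta (assumed)."""
    cycle = t_o + t_p / p
    per_latch = dyn / cycle + leak  # dynamic power scales with frequency, leakage does not
    return per_latch * p ** eta

def metric(p, m, leak=0.0):
    """BIPS^m / W up to a constant factor (BIPS is proportional to 1 / time per instruction)."""
    return time_per_instruction(p) ** (-m) / power(p, leak=leak)

def best_depth(m, leak=0.0, max_depth=40):
    """Pipeline depth in 1..max_depth that maximizes the metric."""
    return max(range(1, max_depth + 1), key=lambda p: metric(p, m, leak))

if __name__ == "__main__":
    for m in (1, 2, 3):
        print(f"m={m}: best depth = {best_depth(m)}")
    print(f"m=3 with leakage: best depth = {best_depth(3, leak=0.05)}")
```

Under these assumed constants, m = 1 and m = 2 favor a depth of one stage (no pipelining), while m = 3 produces an optimum at a depth greater than one, and adding a leakage term does not move that optimum shallower. The sketch only reproduces the qualitative trends stated above; the specific depths it reports depend entirely on the assumed parameters.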