Technology-based Architectural Analysis of Operand Bypass Networks for Efficient Operand Transport

Authors:
Hongkyu Kim;D. Scott Wills;Linda M. Wills
Affiliations:
Georgia Institute of Technology;Georgia Institute of Technology;Georgia Institute of Technology
Venue:
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 15 - Volume 16
Year:
2005

Abstract

As semiconductor feature sizes decrease, interconnect delay is becoming a dominant component of processor cycle time. This creates a critical need to shift microarchitectural design focus from operation computation to operand transport. Operand bypass networks in out-of-order superscalar processors are particularly demanding of wiring resources, and forwarding path delay has become a limiting factor in processor performance. This paper proposes a novel technology-based methodology that evaluates bypass network configurations by predicting operand transport cost. It combines technology modeling techniques with cycle-accurate simulation of benchmark applications to characterize operand movement and storage requirements. Our analysis shows that operand transport cost depends heavily on the physical location of functional units (FUs) and on the instruction steering strategy. We propose two techniques: traffic-based placement, which places FUs according to the transport distribution pattern, and geometry-driven instruction steering, which tries to assign each pair of dependent instructions to adjacent computing resources. Performance is evaluated on an aggressive eight-way, 16-functional-unit processor operating at 1.9 GHz in 100 nm technology. With the two techniques combined, the IPC penalty resulting from wire delay can be kept within 6.8% of an ideal zero-bypass-delay processor for Spec2000Int and within 5.5% for MediaBench.
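The abstract does not spell out the steering or cost model, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes FUs sit on a 2D grid, that transport cost grows with Manhattan distance between producer and consumer FUs, and that steering picks the free FU nearest the producer. The names `manhattan`, `steer`, and `transport_cost`, and the grid layout itself, are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

Coord = Tuple[int, int]  # (row, column) of an FU on an assumed 2D floorplan grid


def manhattan(a: Coord, b: Coord) -> int:
    """Manhattan distance, used here as a stand-in for bypass wire length."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def steer(producer_fu: Optional[int],
          free_fus: Iterable[int],
          position: Dict[int, Coord]) -> int:
    """Pick the free FU closest to the FU that produces the operand.

    Falls back to the lowest-numbered free FU when the instruction has no
    in-flight producer. Ties are broken by FU id (an arbitrary assumption).
    """
    candidates = sorted(free_fus)
    if not candidates:
        raise ValueError("no free functional unit")
    if producer_fu is None:
        return candidates[0]
    src = position[producer_fu]
    return min(candidates, key=lambda fu: (manhattan(position[fu], src), fu))


def transport_cost(traffic: Dict[Tuple[int, int], int],
                   position: Dict[int, Coord]) -> int:
    """Sum of (operand transfers x distance) over all producer/consumer FU pairs.

    A placement with a lower value keeps heavily communicating FUs closer,
    which is the intuition behind placing FUs by observed traffic.
    """
    return sum(count * manhattan(position[src], position[dst])
               for (src, dst), count in traffic.items())
```

For example, with 16 FUs laid out row by row on a 4x4 grid (`position = {i: (i // 4, i % 4) for i in range(16)}`), `steer(5, [0, 6, 15], position)` selects FU 6, the only free unit adjacent to the producer. Comparing `transport_cost` across candidate placements for a measured traffic profile is one simple way to rank them under these assumptions.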