Hierarchical Interconnects for On-Chip Clustering

Authors:
Aneesh Aggarwal;Manoj Franklin
Venue:
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Year:
2002

Abstract

In the sub-micron technology era, wire delays are becoming much more important than gate delays, which makes clustered designs particularly attractive. A common form of clustering adopted in processors is to replace the centralized instruction scheduler with multiple smaller schedulers that work in parallel within a single chip. Studies have found that existing interconnects connecting on-chip clusters, as well as proposed instruction distribution algorithms, are not scalable. The objective of this paper is to investigate alternate interconnects, specifically hierarchical interconnects, that provide scalable performance as the number of on-chip clusters increases. We also investigate the distribution algorithms best suited to these interconnects. Experimental results show that the new interconnects, with appropriate distribution techniques, are more scalable than the existing techniques. They achieve an IPC that is around 15-20% higher than that of the most scalable existing configuration and within 2% of that achieved by a hypothetical ideal processor having a 1-cycle latency crossbar interconnect, irrespective of the number of clusters, confirming their utility and applicability. We also discuss the many other design advantages obtained by using hierarchical interconnects.