The psi-cube: a bus-based cube-type clustering network for high-performance on-chip systems

Authors:
Masaru Takesue
Affiliations:
Department of Electronics and Information Engineering, Hosei University, Tokyo 184-8584, Japan
Venue:
Parallel Computing
Year:
2006

Abstract

This paper proposes a bus-based cube-type network, called the psi-cube, that addresses two problems facing on-chip systems, long wires and a limited number of I/O pins, through a small diameter and dynamic clusters, respectively. The 2^n-node psi-cube is built on the sets of node partitions produced with an extended n-bit Hamming code psi(n,k) [M. Takesue, Psi-Cubes: recursive bused fat-hypercubes for multilevel snoopy caches, in: Proceedings of the International Symposium on Parallel Architectures, Algorithms, and Networks, IEEE CS Press, 1999, pp. 62-67] by connecting the nodes in each partition to the bus owned by the leader of that partition. Because routing proceeds between leaders separated by a distance of 1-3, the diameter is n/2 rounded to an integer, with the rounding depending on how n relates to 2^p-1. The maximum bus length is O(2^(p-1)) or O(2^(k-1)) when the psi-cube is mapped onto an array. Separate sets of clusters are produced dynamically for different off-chip targets such as memory blocks, so the traffic to the cluster leaders is much smaller than with static clusters fixed in hardware. In simulation, the psi-cube outperforms the mesh if the bus delay is less than 4 times the mesh link delay, and the dynamic clusters increase the psi-cube bandwidth by over 60%.
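
As a rough illustration of how a Hamming code can partition cube nodes around leaders, the sketch below uses the textbook (7,4) Hamming code, not the paper's extended code psi(n,k), whose definition is not given in this abstract. It assumes, purely for illustration, that codewords act as leaders and that each node joins the leader at Hamming distance at most 1. The names `syndrome`, `leader_of`, and `clusters` are hypothetical.

```python
from collections import defaultdict

N_BITS = 7  # standard (7,4) Hamming code; bit positions are numbered 1..7


def syndrome(word: int) -> int:
    """XOR of the 1-based positions of the set bits; 0 means a codeword."""
    s = 0
    for pos in range(1, N_BITS + 1):
        if (word >> (pos - 1)) & 1:
            s ^= pos
    return s


def leader_of(word: int) -> int:
    """Nearest codeword: flip the bit named by the syndrome, if any.

    Assumption for this sketch: codewords play the role of cluster leaders.
    """
    s = syndrome(word)
    return word if s == 0 else word ^ (1 << (s - 1))


# Group all 2^7 node addresses by their leader.
clusters = defaultdict(list)
for node in range(1 << N_BITS):
    clusters[leader_of(node)].append(node)

if __name__ == "__main__":
    sizes = sorted({len(members) for members in clusters.values()})
    print(len(clusters), "clusters with sizes", sizes)
```

Because the (7,4) Hamming code is a perfect code, every 7-bit address lies within distance 1 of exactly one codeword, so the 128 nodes split evenly among the 16 leaders. How the paper's extended code forms its partitions, and how leaders own buses, may differ from this simplified picture.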