A Design Methodology for Efficient Application-Specific On-Chip Interconnects

Authors:
Wai Hong Ho;Timothy Mark Pinkston
Affiliations:
IEEE Computer Society;IEEE
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2006

Abstract

As the level of chip integration continues to advance at a fast pace, the desire for efficient interconnects, whether on-chip or off-chip, is rapidly increasing. Traditional interconnects like buses, point-to-point wires, and regular topologies may suffer from poor resource sharing in the time and space domains, leading to high contention or low resource utilization. In this paper, we propose a design methodology for constructing networks for special-purpose computer systems with well-behaved (known) communication characteristics. A temporal and spatial model is proposed to define the sufficient condition for contention-free communication. Based upon this model, a design methodology using a recursive bisection technique is applied to systematically partition a parallel system such that the required number of links and switches is minimized while achieving low contention. Results show that the design methodology can generate more optimized on-chip networks with up to 60 percent fewer resources than meshes or tori while providing blocking performance closer to that of a fully connected crossbar.
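The abstract does not spell out the temporal and spatial model, so the following is only an illustrative sketch of one way such a condition could be checked, not the authors' formulation. It assumes each communication is described by a half-open time interval and the set of links its route occupies, and treats a schedule as contention-free when no two communications that overlap in time also share a link. The names `Message`, `conflicts`, and `is_contention_free` are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, List, NamedTuple, Tuple

Link = Tuple[str, str]  # hypothetical: a link named by its two endpoints


class Message(NamedTuple):
    """One communication (assumed representation, not from the paper)."""
    start: int               # first cycle in which the links are occupied
    end: int                 # first cycle after release (half-open interval)
    links: FrozenSet[Link]   # links used by the message's route


def overlaps_in_time(a: Message, b: Message) -> bool:
    # Half-open intervals [start, end) overlap iff each starts before the other ends.
    return a.start < b.end and b.start < a.end


def shares_link(a: Message, b: Message) -> bool:
    # Spatial overlap: the two routes have at least one link in common.
    return not a.links.isdisjoint(b.links)


def conflicts(messages: List[Message]) -> List[Tuple[Message, Message]]:
    # Every pair that overlaps in both time and space is a potential contention.
    return [
        (a, b)
        for a, b in combinations(messages, 2)
        if overlaps_in_time(a, b) and shares_link(a, b)
    ]


def is_contention_free(messages: List[Message]) -> bool:
    # Assumed sufficient condition: no pair overlaps in both time and space.
    return not conflicts(messages)


if __name__ == "__main__":
    m1 = Message(0, 4, frozenset({("A", "B")}))
    m2 = Message(2, 6, frozenset({("A", "B"), ("B", "C")}))
    print(is_contention_free([m1, m2]))  # m1 and m2 overlap in time and share A-B
```

Under this assumed condition, two communications may reuse the same link at different times, or run at the same time on disjoint links. That kind of sharing in the time and space domains is what the abstract says traditional interconnects exploit poorly.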