Introduction to parallel algorithms and architectures: array, trees, hypercubes
Introduction to parallel algorithms and architectures: array, trees, hypercubes
An O(log N) deterministic packet-routing scheme
Journal of the ACM (JACM)
ICS '90 Proceedings of the 4th international conference on Supercomputing
The cube-connected cycles: a versatile network for parallel computation
Communications of the ACM
IEEE Transactions on Parallel and Distributed Systems
Banyan networks for partitioning multiprocessor systems
ISCA '73 Proceedings of the 1st annual symposium on Computer architecture
A Delay Model and Speculative Architecture for Pipelined Routers
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Building the 4 Processor SB-PRAM Prototype
HICSS '97 Proceedings of the 30th Hawaii International Conference on System Sciences: Advanced Technology Track - Volume 5
Principles and Practices of Interconnection Networks
Principles and Practices of Interconnection Networks
Design of FPGA interconnect for multilevel metallization
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A Mesh-of-Trees Interconnection Network for Single-Chip Parallel Processing
ASAP '06 Proceedings of the IEEE 17th International Conference on Application-specific Systems, Architectures and Processors
HOTI '07 Proceedings of the 15th Annual IEEE Symposium on High-Performance Interconnects
The NYU Ultracomputer Designing an MIMD Shared Memory Parallel Computer
IEEE Transactions on Computers
The Performance of Multistage Interconnection Networks for Multiprocessors
IEEE Transactions on Computers
Fpga-based prototype of a pram-on-chip processor
Proceedings of the 5th conference on Computing frontiers
New lower bound techniques for VLSI
SFCS '81 Proceedings of the 22nd Annual Symposium on Foundations of Computer Science
A Layout-Aware Analysis of Networks-on-Chip and Traditional Interconnects for MPSoCs
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Fpga-based prototype of a pram-on-chip processor
Proceedings of the 5th conference on Computing frontiers
Mesh-of-trees and alternative interconnection networks for single-chip parallelism
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
A Low-Overhead Asynchronous Interconnection Network for GALS Chip Multiprocessors
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
The Journal of Supercomputing
A low-latency adaptive asynchronous interconnection network using bi-modal router nodes
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Better speedups using simpler parallel programming for graph connectivity and biconnectivity
Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores
Hi-index | 0.00 |
Single-chip parallel processing requires high bandwidth between processors and on-chip memory modules. A recently proposed Mesh-of-Trees (MoT) network provides high throughput and low latency at relatively high area cost. In this paper, we introduce a hybrid MoT-BF network that combines MoT network with the area efficient butterfly network. We prove that the hybrid network reduces MoT network's area cost. Cycle-accurate simulation and post-layout results all show that significant area reduction can be achieved with negligible performance degradation, when operating at same clock rate.