A VLSI architecture for concurrent data structures
High-performance computer architecture
Distributing Hot-Spot Addressing in Large-Scale Multiprocessors
IEEE Transactions on Computers
Deadlock-Free Message Routing in Multiprocessor Interconnection Networks
IEEE Transactions on Computers
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Warp: an integrated solution of high-speed parallel computing
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Performance Analysis of k-ary n-cube Interconnection Networks
IEEE Transactions on Computers
Performance analysis of the connection machine
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
An analytic model of multistage interconnection networks
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
SIGMETRICS '91 Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems
The Stanford Dash Multiprocessor
Computer
The J-machine multicomputer: an architectural evaluation
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Linearizer: a heuristic algorithm for queueing network models of computing systems
Communications of the ACM
Limits on Interconnection Network Performance
IEEE Transactions on Parallel and Distributed Systems
On characterizing bandwidth requirements of parallel applications
Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Petri net modeling of interconnection networks for massively parallel architectures
ICS '95 Proceedings of the 9th international conference on Supercomputing
Evaluating virtual channels for cache-coherent shared-memory multiprocessors
ICS '96 Proceedings of the 10th international conference on Supercomputing
The Case for Chaotic Adaptive Routing
IEEE Transactions on Computers
A General Theory for Deadlock Avoidance in Wormhole-Routed Networks
IEEE Transactions on Parallel and Distributed Systems
Wormhole routing techniques for directly connected multicomputer systems
ACM Computing Surveys (CSUR)
An Application-Driven Study of Parallel System Overheads and Network Bandwidth Requirements
IEEE Transactions on Parallel and Distributed Systems
Performance Modeling of ServerNet™ SAN Topologies
The Journal of Supercomputing
AMVA techniques for high service time variability
Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Minimal adaptive routing with limited injection on Toroidal k-ary n-cubes
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Modeling of interconnection subsystems for massively parallel computers
Performance Evaluation
Balancing Buffer Utilization in Meshes Using a 'Restricted Area' Concept
IEEE Transactions on Parallel and Distributed Systems
Tandem Computers Incorporated: Performance Modeling of ServerNet™ Topologies
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
A Hybrid Time Synchronization Implemented Through Special Ring Array for Mesh or Torus
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Latency Tolerance: A Metric for Performance Analysis of Multithreaded Architectures
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
An Accurate Model for the Performance Analysis of Deterministic Wormhole Routing
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Maximum Delivery Time and Hot Spots in ServerNet(tm) Topologies
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Bidirectional versus Unidirectional Networks: Cost/Performance Trade-Offs
MASCOTS '95 Proceedings of the 3rd International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems
Performance Analysis of Wormhole Switching with Adaptive Routing in a Two-Dimensional Torus
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Mean Value Analysis: a Personal Account
Performance Evaluation: Origins and Directions
Analysis of k-ary n-cubes with dimension-ordered routing
Future Generation Computer Systems - Selected papers from CCGRID 2002
Fault-tolerant adaptive routing for two-dimensional meshes
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Abstracting network characteristics and locality properties of parallel systems
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Measurement and Modeling of EARTH-MANNA Multithreaded Architecture
MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Analysis of Buffer Design for Adaptive Routing in Direct Networks
MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
A queueing model for wormhole routing with timeout
ICCCN '95 Proceedings of the 4th International Conference on Computer Communications and Networks
Parallel program performance prediction using deterministic task graph analysis
ACM Transactions on Computer Systems (TOCS)
High-level power analysis for on-chip networks
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
The Effect of Virtual Channel Organization on the Performance of Interconnection Networks
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 14 - Volume 15
Performance analysis of a QoS capable cluster interconnect
Performance Evaluation - Performance modelling and evaluation of high-performance parallel and distributed systems
Compiler-directed channel allocation for saving power in on-chip networks
Conference record of the 33rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Application-specific buffer space allocation for networks-on-chip router design
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Stochastic Analysis of Deterministic Routing Algorithms in the Presence of Self-Similar Traffic
The Journal of Supercomputing
Interconnection Networks for Scalable Quantum Computers
Proceedings of the 33rd annual international symposium on Computer Architecture
Proceedings of the 43rd annual Design Automation Conference
Explanation of Performance Degradation in Turn Model
The Journal of Supercomputing
Comparison of Mesh and Hierarchical Networks for Multiprocessors
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
Quantum-like effects in network-on-chip buffers behavior
Proceedings of the 44th annual Design Automation Conference
Dynamic channel selection: an efficient strategy for balancing traffic in meshes
International Journal of Computational Science and Engineering
Combinatorial performance modelling of toroidal cubes
Journal of Systems Architecture: the EUROMICRO Journal
Synthesis of predictable networks-on-chip-based interconnect architectures for chip multiprocessors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
An accurate and efficient performance analysis approach based on queuing model for Network on Chip
Proceedings of the 2009 International Conference on Computer-Aided Design
Performance modeling of n-dimensional mesh networks
Performance Evaluation
Power-performance analysis of networks-on-chip with arbitrary buffer allocation schemes
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems - Special section on the ACM IEEE international conference on formal methods and models for codesign (MEMOCODE) 2009
Modeling the effects of hot-spot traffic load on the performance of wormhole-switched hypermeshes
Computers and Electrical Engineering
An analytical model for Network-on-Chip with finite input buffer
Frontiers of Computer Science in China
Abstraction-based performance verification of NoCs
Proceedings of the 48th Design Automation Conference
This paper develops detailed analytical performance models for k-ary n-cube networks with single-flit or infinite buffers, wormhole routing, and the nonadaptive deadlock-free routing scheme proposed by Dally and Seitz (1987). In contrast to previous performance studies of such networks, the system is modeled as a closed queueing network that includes the effects of blocking and pipelining of messages in the network, allows for arbitrary source-destination probability distributions, and explicitly models the virtual channels used in the deadlock-free routing algorithm. The models are used to examine several performance issues for 2-D networks with shared-memory traffic. These results should prove useful for engineering high-performance systems based on low-dimensional k-ary n-cube networks.
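
To illustrate what modeling a system as a closed queueing network involves, the following is a minimal sketch of textbook exact mean value analysis (MVA) for a single-class closed network. It is not the paper's model: it ignores blocking, message pipelining, and virtual channels, and the function name `exact_mva`, its parameters, and the example service demands are assumptions chosen for illustration only.

```python
from typing import List, Tuple


def exact_mva(demands: List[float], customers: int,
              think_time: float = 0.0) -> Tuple[float, List[float]]:
    """Textbook exact MVA for a single-class closed queueing network.

    demands    -- service demand at each queueing station (hypothetical values)
    customers  -- number of customers circulating in the closed network
    think_time -- delay spent outside the queueing stations per cycle

    Returns (throughput, mean queue length at each station).
    Assumes load-independent stations and no blocking, unlike the
    wormhole-routed networks analyzed in the paper.
    """
    queue = [0.0] * len(demands)   # mean queue lengths with 0 customers
    throughput = 0.0
    for n in range(1, customers + 1):
        # Arrival theorem: an arriving customer sees the queue lengths
        # of the same network with one customer fewer.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(residence))
        # Little's law applied to each station.
        queue = [throughput * r for r in residence]
    return throughput, queue


if __name__ == "__main__":
    # Two identical stations, two customers: a small balanced example.
    x, q = exact_mva([1.0, 1.0], 2)
    print(x, q)
```

The recursion adds one customer at a time, so its cost grows linearly with the population. Capturing blocking between channels, as the abstract describes, would require extending the residence-time equations beyond this sketch.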