Interconnection networks usually consist of a fabric of interconnected routers, which receive packets arriving at their input ports and forward them to appropriate output ports. Unfortunately, network packets moving through these routers are often delayed by conflicting demands for resources, such as output ports or buffer space. Hence, routers typically employ arbiters that resolve these conflicts so as to maximize the number of matches between packets waiting at input ports and free output ports. Efficient design and implementation of the algorithm running on these arbiters is critical to maximizing network performance.

This paper proposes a new arbitration algorithm called SPAA (Simple Pipelined Arbitration Algorithm), which is implemented in the Alpha 21364 processor's on-chip router pipeline. Simulation results show that SPAA significantly outperforms two earlier well-known arbitration algorithms: PIM (Parallel Iterative Matching) and WFA (Wave-Front Arbiter) implemented in the SGI Spider switch. SPAA outperforms PIM and WFA because it exhibits matching capabilities similar to theirs under realistic conditions in which many output ports are busy, takes fewer clock cycles to perform the arbitration, and can be pipelined effectively. Additionally, we propose a new prioritization policy called the Rotary Rule, which prevents the performance degradation caused by network saturation at high loads by prioritizing packets already in the network over new packets generated by caches or memory.
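To make the two ideas in the abstract concrete, the following is a minimal sketch of a single-round arbiter with a Rotary-Rule-style priority. It is not the SPAA implementation from the Alpha 21364. The names `Request`, `arbitrate`, `in_network`, `age`, and `prefer_in_network` are hypothetical, and the age-based tie-break is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """A packet at an input port asking for an output port (hypothetical model)."""
    input_port: int
    output_port: int
    in_network: bool  # True: arrived from another router; False: newly injected locally
    age: int = 0      # cycles waited so far; an assumed tie-breaker, not from the paper


def arbitrate(requests, free_outputs, prefer_in_network=True):
    """Run one nominate-then-grant round and return {output_port: granted Request}.

    With prefer_in_network=True, packets already in the network outrank newly
    injected ones, which is the prioritization the Rotary Rule describes.
    """
    def priority(r):
        # Tuples compare element by element: in-network status first, then age.
        return (r.in_network if prefer_in_network else False, r.age)

    # Step 1: each input port nominates its best request among free outputs.
    nominee = {}
    for r in requests:
        if r.output_port not in free_outputs:
            continue  # busy outputs cannot be matched this round
        best = nominee.get(r.input_port)
        if best is None or priority(r) > priority(best):
            nominee[r.input_port] = r

    # Step 2: each free output grants the best nominee that targets it.
    grant = {}
    for r in nominee.values():
        best = grant.get(r.output_port)
        if best is None or priority(r) > priority(best):
            grant[r.output_port] = r
    return grant


if __name__ == "__main__":
    reqs = [
        Request(input_port=0, output_port=2, in_network=False, age=5),
        Request(input_port=1, output_port=2, in_network=True, age=1),
        Request(input_port=1, output_port=3, in_network=True, age=0),
    ]
    for out, r in arbitrate(reqs, free_outputs={2}).items():
        print(f"output {out} <- input {r.input_port}")
```

In this example only output 2 is free, so the request for output 3 is skipped. With the priority enabled, the in-network packet at input 1 wins output 2 even though the locally injected packet at input 0 has waited longer. With `prefer_in_network=False`, the older local packet wins instead.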