Analysis of the parallel packet switch architecture

Authors:
Sundar Iyer; Nick W. McKeown
Affiliations:
Computer Systems Laboratory, Stanford University, Stanford, CA; Computer Systems Laboratory, Stanford University, Stanford, CA
Venue:
IEEE/ACM Transactions on Networking (TON)
Year:
2003

Abstract

Our work is motivated by the desire to design packet switches with large aggregate capacity and fast line rates. In this paper, we consider building a packet switch from multiple lower speed packet switches operating independently and in parallel. In particular, we consider a (perhaps obvious) parallel packet switch (PPS) architecture in which arriving traffic is demultiplexed over k identical lower speed packet switches, switched to the correct output port, then recombined (multiplexed) before departing from the system. Essentially, the packet switch performs packet-by-packet load balancing, or inverse multiplexing, over multiple independent packet switches. Each lower speed packet switch operates at a fraction of the line rate R. For example, each packet switch can operate at rate R/k. It is a goal of our work that all memory buffers in the PPS run slower than the line rate. Ideally, a PPS would share the benefits of an output-queued switch, i.e., the delay of individual packets could be precisely controlled, allowing the provision of guaranteed qualities of service. In this paper, we ask the question: Is it possible for a PPS to precisely emulate the behavior of an output-queued packet switch with the same capacity and with the same number of ports? We show that it is theoretically possible for a PPS to emulate a first-come first-served (FCFS) output-queued (OQ) packet switch if each lower speed packet switch operates at a rate of approximately 2R/k. We further show that it is theoretically possible for a PPS to emulate a wide variety of quality-of-service queueing disciplines if each lower speed packet switch operates at a rate of approximately 3R/k. It turns out that these results are impractical because of high communication complexity, but a practical high-performance PPS can be designed if we slightly relax our original goal and allow a small fixed-size coordination buffer running at the line rate in both the demultiplexer and the multiplexer.
We determine the size of this buffer and show that it can eliminate the need for a centralized scheduling algorithm, allowing a fully distributed implementation with low computational and communication complexity. Furthermore, we show that if the lower speed packet switch operates at a rate of R/k (i.e., without speedup), the resulting PPS can emulate an FCFS-OQ switch within a delay bound.
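
To make the demultiplex, switch, and multiplex structure concrete, the following is a minimal Python sketch, not the authors' algorithm. The helper `layer_rate` simply evaluates the per-layer rates quoted in the abstract (R/k, roughly 2R/k, roughly 3R/k). The `ToyPPS` class, its round-robin layer selection, and its sequence-number resequencing are all hypothetical choices made for illustration; the abstract does not specify how packets are assigned to layers.

```python
from collections import deque


def layer_rate(line_rate, k, speedup=1):
    """Rate of each of the k lower speed switches: speedup * R / k."""
    if k <= 0:
        raise ValueError("k must be positive")
    return speedup * line_rate / k


class ToyPPS:
    """Toy PPS: demultiplex over k layers, then multiplex per output port.

    Assumption: round-robin layer selection per input and a global
    arrival sequence number for reordering. Neither is taken from the paper.
    """

    def __init__(self, k, num_ports):
        self.k = k
        self.next_layer = [0] * num_ports  # per-input round-robin pointer
        # layers[l][out] queues packets held in layer l for output `out`
        self.layers = [[deque() for _ in range(num_ports)] for _ in range(k)]
        self.seq = 0  # global arrival order, used only for FCFS departure

    def arrive(self, in_port, out_port, payload):
        """Demultiplexer: send the packet to the next layer for this input."""
        layer = self.next_layer[in_port]
        self.next_layer[in_port] = (layer + 1) % self.k
        self.layers[layer][out_port].append((self.seq, payload))
        self.seq += 1
        return layer

    def depart(self, out_port):
        """Multiplexer: release the oldest packet for out_port across layers."""
        heads = [q for q in (layer[out_port] for layer in self.layers) if q]
        if not heads:
            return None
        oldest = min(heads, key=lambda q: q[0][0])
        return oldest.popleft()[1]


if __name__ == "__main__":
    R, k = 10e9, 10  # example values only: 10 Gb/s line rate, 10 layers
    for s in (1, 2, 3):
        print(f"speedup {s}: each layer runs at {layer_rate(R, k, s):.3g} b/s")
```

With speedup 1 each layer runs at R/k, the no-speedup case for which the abstract claims FCFS-OQ emulation within a delay bound; speedups of 2 and 3 correspond to the approximate rates quoted for exact FCFS and QoS emulation. The toy multiplexer cheats by looking at a global sequence number, which is exactly the kind of centralized information the paper's coordination buffers are meant to avoid.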