Scalable Hardware Priority Queue Architectures for High-Speed Packet Switches

Authors:
Sung-Whan Moon; Kang G. Shin; Jennifer Rexford
Affiliations:
Univ. of Michigan, Ann Arbor; Univ. of Michigan, Ann Arbor; AT&T Labs, Florham Park, NJ
Venue:
IEEE Transactions on Computers
Year:
2000

References (5):

Calendar queues: a fast O(1) priority queue implementation for the simulation event set problem
Communications of the ACM

A VLSI priority packet queue with inheritance and overwrite
IEEE Transactions on Very Large Scale Integration (VLSI) Systems

A Router Architecture for Real-Time Communication in Multicomputer Networks
IEEE Transactions on Computers

Providing Quality of Service in Packet Switched Networks
Performance Evaluation of Computer and Communication Systems, Joint Tutorial Papers of Performance '93 and Sigmetrics '93

Hardware-efficient fair queueing architectures for high-speed networks
INFOCOM'96 Proceedings of the Fifteenth Annual Joint Conference of the IEEE Computer and Communications Societies - Volume 2

Cited by (13):

Design issues and performance improvements in routing strategy on the internet workflow
International Journal of Network Management

Multistage-Based Switching Fabrics for Scalable Routers
IEEE Transactions on Parallel and Distributed Systems

Multitasking on reconfigurable architectures: microarchitecture support and dynamic scheduling
ACM Transactions on Embedded Computing Systems (TECS)

An evolutionary management scheme in high-performance packet switches
IEEE/ACM Transactions on Networking (TON)

Space priority queue with fuzzy set threshold
Computer Communications

Deadline-based scheduling in support of real-time data delivery
Computer Networks: The International Journal of Computer and Telecommunications Networking

FPGA based hardware scheduler for multiprocessor systems
ACC'08 Proceedings of the WSEAS International Conference on Applied Computing Conference

Hardware IP for scheduling of periodic tasks in multiprocessor systems
WSEAS Transactions on Computer Research

Run-time HW/SW scheduling of data flow applications on reconfigurable architectures
EURASIP Journal on Embedded Systems - Special issue on design and architectures for signal and image processing

Hardware supported task scheduling on dynamically reconfigurable SoC architectures
IEEE Transactions on Very Large Scale Integration (VLSI) Systems

A hardware NIC scheduler to guarantee QoS on high performance servers
ISPA'06 Proceedings of the 4th international conference on Parallel and Distributed Processing and Applications

XML-based policy engineering framework for heterogeneous network management
APNOMS'07 Proceedings of the 10th Asia-Pacific conference on Network Operations and Management Symposium: managing next generation networks and services

SENIC: scalable NIC for end-host rate limiting
NSDI'14 Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation

Abstract

With effective packet-scheduling mechanisms, modern integrated networks can support the diverse quality-of-service requirements of emerging applications. However, arbitrating between a large number of small packets on a high-speed link requires an efficient hardware implementation of a priority queue. To highlight the challenges of building scalable priority queue architectures, this paper includes a detailed comparison of four existing approaches: a binary tree of comparators, a priority encoder with multiple first-in-first-out lists, a shift register, and a systolic array. Based on these comparison results, we propose two new architectures that scale to the large number of packets (N) and large number of priority levels (P) necessary in modern switch designs. The first architecture combines the faster clock speed of a systolic array with the lower memory requirements of a shift register, resulting in a hybrid design; a tunable parameter allows switch designers to balance the trade-off between bus loading and chip area. The second architecture extends this hybrid design to serve multiple output ports in a shared-memory switch, which significantly reduces complexity compared with the traditional approach of dedicating a separate priority queue to each outgoing link. Using the Verilog hardware description language and the Epoch silicon compiler, we have designed and simulated these two new architectures, as well as the four existing approaches. The simulation experiments compare the designs across a range of priority queue sizes and performance metrics, including enqueue/dequeue speed, chip area, and number of transistors.
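To make two of the four existing approaches named in the abstract more concrete, the following is a minimal behavioral sketch in Python. It is not the authors' Verilog design and models only the enqueue/dequeue ordering, not timing, bus loading, or chip area. The class names, the convention that a smaller number means higher priority, and the choice to reject entries when the queue is full are all assumptions made for illustration.

```python
from collections import deque


class FifoPriorityQueue:
    """Sketch: priority encoder with one FIFO list per priority level (P levels)."""

    def __init__(self, num_levels):
        self.fifos = [deque() for _ in range(num_levels)]

    def enqueue(self, priority, packet):
        # Assumption: level 0 is the most urgent priority.
        self.fifos[priority].append(packet)

    def dequeue(self):
        # The "priority encoder" step: pick the first non-empty FIFO.
        for fifo in self.fifos:
            if fifo:
                return fifo.popleft()
        return None  # all FIFOs are empty


class ShiftRegisterPriorityQueue:
    """Sketch: N blocks kept in sorted order; the head holds the most urgent entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = []  # (priority, packet) pairs in sorted order

    def enqueue(self, priority, packet):
        if len(self.blocks) == self.capacity:
            return False  # assumption: a full queue rejects the new entry
        # Hardware would compare the new entry against all blocks in parallel;
        # this sequential scan finds the same insertion position.
        pos = 0
        while pos < len(self.blocks) and self.blocks[pos][0] <= priority:
            pos += 1  # "<=" keeps arrival order among equal priorities
        self.blocks.insert(pos, (priority, packet))  # later entries shift right
        return True

    def dequeue(self):
        if not self.blocks:
            return None
        return self.blocks.pop(0)[1]  # remaining entries shift toward the head


# Usage: entries leave in priority order regardless of arrival order.
q = ShiftRegisterPriorityQueue(capacity=4)
for prio, pkt in [(3, "c"), (1, "a"), (2, "b")]:
    q.enqueue(prio, pkt)
order = [q.dequeue() for _ in range(3)]  # ["a", "b", "c"]
```

The sketch hints at the scaling trade-off the abstract describes: the FIFO-based model needs storage structures proportional to the number of priority levels P, while the sorted-array model needs one block per queued packet, up to N.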