With effective packet-scheduling mechanisms, modern integrated networks can support the diverse quality-of-service requirements of emerging applications. However, arbitrating between a large number of small packets on a high-speed link requires an efficient hardware implementation of a priority queue. To highlight the challenges of building scalable priority queue architectures, this paper includes a detailed comparison of four existing approaches: a binary tree of comparators, priority encoder with multiple first-in-first-out lists, shift register, and systolic array. Based on these comparison results, we propose two new architectures that scale to the large number of packets (N) and large number of priority levels (P) necessary in modern switch designs. The first architecture combines the faster clock speed of a systolic array with the lower memory requirements of a shift register, resulting in a hybrid design; a tunable parameter allows switch designers to carefully balance the trade-off between bus loading and chip area. We then extend this architecture to serve multiple output ports in a shared-memory switch. This significantly decreases complexity over the traditional approach of dedicating a separate priority queue to each outgoing link. Using the Verilog hardware description language and the Epoch silicon compiler, we have designed and simulated these two new architectures, as well as the four existing approaches. The simulation experiments compare the designs across a range of priority queue sizes and performance metrics, including enqueue/dequeue speed, chip area, and number of transistors.
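
As a rough illustration of one of the existing approaches named above, the shift register, the following is a minimal behavioral sketch in Python rather than the Verilog used in the paper. It assumes that a new entry is broadcast to every cell, that each cell decides locally (from its own entry and its left neighbor's entry) whether to keep, shift right, or latch the new entry, and that a smaller priority value means higher priority. The class name `ShiftRegisterPQ`, its methods, and the FIFO tie-breaking rule are hypothetical choices made for this example and are not taken from the paper.

```python
class ShiftRegisterPQ:
    """Software model of a shift-register priority queue (illustrative only).

    Cell 0 is the head. Assumption: smaller priority value = served first.
    """

    def __init__(self, capacity):
        self.cells = [None] * capacity  # None marks an empty cell

    def enqueue(self, priority, payload):
        if self.cells[-1] is not None:
            raise OverflowError("queue full")
        new = (priority, payload)
        old = self.cells
        nxt = list(old)
        for i in range(len(old)):
            # Each cell sees only the broadcast entry, its own entry,
            # and its left neighbor's entry.
            left = old[i - 1] if i > 0 else None
            here_after = old[i] is None or old[i][0] > priority
            left_after = i > 0 and (left is None or left[0] > priority)
            if left_after:
                nxt[i] = left   # shift right: take the left neighbor's entry
            elif here_after:
                nxt[i] = new    # this cell latches the new entry
            # otherwise the cell keeps its current entry
        self.cells = nxt

    def dequeue(self):
        head = self.cells[0]
        if head is None:
            raise IndexError("queue empty")
        # All cells shift one position toward the head.
        self.cells = self.cells[1:] + [None]
        return head


q = ShiftRegisterPQ(4)
for prio, name in [(3, "b"), (1, "a"), (2, "c"), (1, "d")]:
    q.enqueue(prio, name)
order = [q.dequeue() for _ in range(4)]
```

In this model every cell updates in the same step, which mirrors why the broadcast of the new entry to all cells (the bus loading mentioned in the abstract) becomes a concern as the queue grows. The loop over cells stands in for what would be parallel hardware logic.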