Pipelined heap (priority queue) management for advanced scheduling in high-speed networks

  • Authors:
  • Aggelos Ioannou;Manolis G. H. Katevenis

  • Affiliations:
  • Computer Architecture and VLSI Systems Laboratory, Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, GR, Greece and Department of Computer Sci ...;Computer Architecture and VLSI Systems Laboratory, Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, GR, Greece and Department of Computer Sci ...

  • Venue:
  • IEEE/ACM Transactions on Networking (TON)
  • Year:
  • 2007

Abstract

Per-flow queueing with sophisticated scheduling is one of the methods for providing advanced quality of service (QoS) guarantees. The hardest and most interesting scheduling algorithms rely on a common computational primitive, implemented via priority queues. To support such scheduling for a large number of flows at OC-192 (10 Gb/s) rates and beyond, pipelined management of the priority queue is needed. Large priority queues can be built using either calendar queues or heap data structures; heaps require less silicon area than calendar queues. We present heap management algorithms that can be gracefully pipelined; they are modifications of the traditional ones. We discuss how to use pipelined heap managers in switches and routers, and their cost-performance tradeoffs. The design can be configured to any heap size, and, using 2-port 4-wide SRAMs, it can initiate a new operation on every clock cycle, except that an insert operation or one idle (bubble) cycle is needed between two successive delete operations. We present a pipelined heap manager implemented in synthesizable Verilog form, as a core that can be integrated into ASICs, along with cost and performance analysis information. For a 16 K entry example in 0.13-µm CMOS technology, silicon area is below 10 mm² (less than 8% of a typical ASIC chip) and performance is a few hundred million operations per second. We have verified our design by simulating it against three heap models of varying abstraction.
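
The abstract does not spell out how the traditional heap algorithms are modified. As a rough illustration only, the software sketch below shows one way an insert can be made to travel from the root toward the leaves, the same direction a delete already takes, so that both operations touch one heap level at a time. The class name `TopDownHeap` and its methods are hypothetical, and this is an assumption about the general idea, not the authors' Verilog design.

```python
class TopDownHeap:
    """Binary min-heap whose insert walks root-to-leaf (illustrative sketch)."""

    def __init__(self):
        self.a = [None]  # 1-indexed storage; a[0] is unused

    def __len__(self):
        return len(self.a) - 1

    def insert(self, key):
        target = len(self.a)              # index of the first empty slot
        self.a.append(None)
        depth = target.bit_length() - 1   # number of levels below the root
        node, carried = 1, key
        # The bits of `target` below its leading 1 spell the path from the
        # root: 0 means go left, 1 means go right.
        for level in range(depth - 1, -1, -1):
            if carried < self.a[node]:
                # Keep the smaller key here and carry the larger one down.
                carried, self.a[node] = self.a[node], carried
            node = 2 * node + ((target >> level) & 1)
        self.a[node] = carried

    def delete_min(self):
        if len(self) == 0:
            raise IndexError("delete_min from an empty heap")
        top = self.a[1]
        last = self.a.pop()               # last element refills the root
        n = len(self)
        if n >= 1:
            node = 1
            while True:
                child = 2 * node
                if child > n:
                    break
                if child + 1 <= n and self.a[child + 1] < self.a[child]:
                    child += 1            # pick the smaller child
                if self.a[child] < last:
                    self.a[node] = self.a[child]
                    node = child
                else:
                    break
            self.a[node] = last
        return top
```

In this sketch every operation visits levels strictly in root-to-leaf order, which is the property that would let a hardware implementation devote one pipeline stage to each heap level. The bubble-cycle constraint between successive deletes and the SRAM organization described in the abstract are not modeled here.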