Scalable Tree-Based Architectures for IPv4/v6 Lookup Using Prefix Partitioning

Authors:
Hoang Le;Viktor K. Prasanna
Affiliations:
University of Southern California, Los Angeles;University of Southern California, Los Angeles
Venue:
IEEE Transactions on Computers
Year:
2012

Citing 0
Cited 3

LEAP: latency- energy- and area-optimized lookup pipeline

Proceedings of the eighth ACM/IEEE symposium on Architectures for networking and communications systems
Scalable high-throughput architecture for large balanced tree structures on FPGA (abstract only)

Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
An architecture for IPv6 lookup using parallel index generation units

ARC'13 Proceedings of the 9th international conference on Reconfigurable Computing: architectures, tools, and applications

Quantified Score

Hi-index	14.98

Visualization

Abstract

Memory efficiency and dynamically updateable data structures for Internet Protocol (IP) lookup have regained much interest in the research community. In this paper, we revisit the classic tree-based approach for solving the longest prefix matching (LPM) problem used in IP lookup. In particular, we target our solutions for a class of large and sparsely distributed routing tables, such as those potentially arising in the next-generation IPv6 routing protocol. Due to longer prefix lengths and much larger address space, preprocessing such routing tables for tree-based LPM can significantly increase the number of prefixes and/or memory stages required for IP lookup. We propose a prefix partitioning algorithm (DPP) to divide a given routing table into k groups of disjoint prefixes (k is given). The algorithm employs dynamic programming to determine the optimal split lengths between the groups to minimize the total memory requirement. Our algorithm demonstrates a substantial reduction in the memory footprint compared with those of the state of the art in both IPv4 and IPv6 cases. Two proposed linear pipelined architectures, which achieve high throughput and support incremental updates, are also presented. The proposed algorithm and architectures achieve a memory efficiency of 1 byte of memory for each byte of prefix for both IPv4 and IPv6. As a result, our design scales well to support either larger routing tables, longer prefix lengths, or both. The total memory requirement depends solely on the number of prefixes. Implementations on 45 nm ASIC and a state-of-the-art FPGA device (for a routing table consisting of 330K prefixes) show that our algorithm achieves 980 and 410 million lookups per second, respectively. These results are well suited for 100 Gbps lookup. The implementations also scale to support larger routing tables and longer prefix length when we go from IPv4 to IPv6. Additionally, the proposed architectures can easily interface with external SRAMs to ease the limitation of on-chip memory of the target devices.