LATA: a latency and throughput-aware packet processing system

Authors:
Jilong Kuang;Laxmi Bhuyan
Affiliations:
University of California, Riverside, CA;University of California, Riverside, CA
Venue:
Proceedings of the 47th Design Automation Conference
Year:
2010


Abstract

Current packet processing systems aim only at high throughput and do not consider reducing packet latency. For many real-time embedded network applications, it is essential that the processing time not exceed a given threshold. In this paper, we propose LATA, a LAtency and Throughput-Aware packet processing system for multicore architectures. Based on a parallel pipeline core topology, LATA can satisfy the latency constraint and produce high throughput by exploiting fine-grained task-level parallelism. We implement LATA on an Intel machine with two Quad-Core Xeon E5335 processors and compare it with four other systems (Parallel, Greedy, Random and Bipar) on six network applications. LATA reduces latency by 36.5% on average, and by a maximum of 62.2% for URL over Random, with comparable throughput.