Compiler-Supported Thread Management for Multithreaded Network Processors

Authors:
Xiaotong Zhuang;Santosh Pande
Affiliations:
IBM T.J. Watson Research Laboratory;Georgia Institute of Technology
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Year:
2011

Abstract

Traditionally, runtime management such as CPU sharing and real-time scheduling is provided by the runtime environment (typically an operating system) using hardware support such as timers and interrupts. However, because of the stringent performance requirements on network processors, neither OS nor hardware mechanisms are typically feasible or available. Mapping packet processing tasks onto network processors involves complex trade-offs to maximize parallelism and pipelining. As code stores grow and application requirements become more complex, network processors are being programmed with heterogeneous threads that may execute code belonging to different tasks on a given micro-engine. In addition, most network applications are streaming applications that are typically processed in a pipelined fashion, so the tasks on different micro-engines are pipelined to maximize throughput. The tasks themselves may have different runtime performance demands. In this article, we focus on network processors on which the hardware can only schedule threads in a round-robin fashion and no OS assistance is provided. We show that it is very difficult and inefficient for the programmer to meet runtime management constraints by coding them statically. Because a hardware or OS solution is infeasible (even in the near future), we take a compiler approach. We propose a complete compiler solution that automatically inserts the explicit context switch (ctx) instructions provided on the network processor so that thread execution is better controlled at runtime to meet these constraints. Two approaches are presented that control programs’ runtime behavior with different applicability and overheads. We show that this approach is feasible and that it opens new application domains that need heterogeneous thread programming. Such approaches would, in general, become important for multicore processors. Finally, our experiments show that the runtime constraints are enforced nearly ideally, with minimal runtime degradation and small code growth.
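
To illustrate why the placement of ctx instructions matters under round-robin hardware scheduling, the following is a minimal sketch and not the algorithm from the article. It assumes a single micro-engine, threads that run without preemption until they reach a ctx, and a zero-cost context switch. The names `simulate_round_robin`, `cpu_shares`, and `run_lengths` are hypothetical. Under these assumptions, each thread's share of cycles is proportional to the number of instructions it executes between consecutive ctx instructions, which is the quantity a compiler could adjust.

```python
from collections import deque


def simulate_round_robin(run_lengths, total_cycles):
    """Simulate non-preemptive round-robin threads on one micro-engine.

    run_lengths[i] is the (hypothetical) number of instructions thread i
    executes between two consecutive ctx instructions. One instruction is
    assumed to take one cycle, and the switch itself is assumed to be free.
    Returns the number of cycles each thread received.
    """
    executed = [0] * len(run_lengths)
    ready = deque(range(len(run_lengths)))  # round-robin order
    remaining = total_cycles
    while remaining > 0:
        tid = ready.popleft()
        # The thread runs until its next ctx or until the simulation ends.
        burst = min(run_lengths[tid], remaining)
        executed[tid] += burst
        remaining -= burst
        # ctx: the thread yields and moves to the back of the queue.
        ready.append(tid)
    return executed


def cpu_shares(run_lengths, total_cycles):
    """Return each thread's fraction of the simulated cycles."""
    executed = simulate_round_robin(run_lengths, total_cycles)
    return [cycles / total_cycles for cycles in executed]


if __name__ == "__main__":
    # Thread 0 yields every 30 instructions, threads 1 and 2 every 10.
    print(cpu_shares([30, 10, 10], 5000))
```

In this toy model, spacing ctx instructions 30, 10, and 10 instructions apart gives the three threads cycle counts in a 3:1:1 ratio. A real micro-engine also has switch latency and threads that block on memory, which the sketch leaves out.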