Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
The program dependence graph and its use in optimization
ACM Transactions on Programming Languages and Systems (TOPLAS)
Combinatorial optimization: algorithms and complexity
Combinatorial optimization: algorithms and complexity
Register allocation via clique separators
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Introduction to algorithms
The priority-based coloring approach to register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Register allocation via hierarchical graph coloring
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Whole-program optimization for time and space efficient threads
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Advanced compiler design and implementation
Advanced compiler design and implementation
C Compiler Design for an Industrial Network Processor
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
Network processing in content inspection applications
Proceedings of the 14th international symposium on Systems synthesis
Building a robust software-based router using network processors
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Inter-task register-allocation for static operating systems
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Experience with a retargetable compiler for a commercial network processor
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
NetBench: a benchmarking suite for network processors
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
WRAPS Scheduling and Its Efficient Implementation on Network Processors
HiPC '02 Proceedings of the 9th International Conference on High Performance Computing
Effective Compilation Support for Variable Instruction Set Architecture
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Taming the IXP network processor
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
A pipelined memory architecture for high throughput network processors
Proceedings of the 30th annual international symposium on Computer architecture
CommBench-a telecommunications benchmark for network processors
ISPASS '00 Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Effective thread management on network processors with compiler analysis
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
Mobile Information Systems - Mobile and Wireless Networks
Compiler assisted dynamic management of registers for network processors
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Compiler-Supported Thread Management for Multithreaded Network Processors
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
Modern network processors employ multi-threading to allow concurrency amongst multiple packet processing tasks. We studied the properties of applications running on the network processors and observed that their imbalanced register requirements across different threads at different program points could lead to poor performance. Many times application needs demand some threads to be more performance critical than others and thus by controlling the register allocation across threads one could impact the performance of the threads and get the desired performance properties for concurrent threads. This prompts our work.Our register allocator aims to distribute available registers to different threads according to their needs. The compiler analyzes the register needs of each thread both at the point of a context switch as well as internally. Compiler then designates some registers as shared and some as private to each thread. Shared registers are allocated across all threads explicitly by the compiler. Values that are live across a context switch can not be kept in shared registers due to safety reasons; thus, only those live ranges that are internal to the context switch can be safely allocated to shared registers. Spill can cause a context switch. and thus, the problems of context switch and allocation are closely coupled and we propose a solution to this problem. The proposed interference graphs (GIG,BIG,IIG) distinguish variables that must use a thread's private registers from those that can use shared registers. We first estimate the register requirement bounds, then reduce from the upper bound gradually to achieve a good register balance among threads. To reduce the register needs, move insertions are inserted at program points that split the live ranges or the nodes on the interference graph. We show that the lower bound is reachable via live range splitting and is adequate for our benchmark programs for simultaneously assigning them on different threads. As our objective, the number of move instructions is minimized.Empirical results show that the compiler is able to effectively control the register allocation across threads by maximizing the number of shared registers. Speed-up for performance critical threads ranges from 18 to 24% whereas degradation for performance of non-critical threads ranges only from 1 to 4%.