Heuristic Algorithms for Task Assignment in Distributed Systems
IEEE Transactions on Computers
Software pipelining: an effective scheduling technique for VLIW machines
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Compiler optimizations for asynchronous systolic array programs
POPL '88 Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Scheduling for functional pipelining and loop winding
DAC '91 Proceedings of the 28th ACM/IEEE Design Automation Conference
The formal semantics of programming languages: an introduction
The formal semantics of programming languages: an introduction
A Transformation System for Developing Recursive Programs
Journal of the ACM (JACM)
A System for Assisting Program Transformation
ACM Transactions on Programming Languages and Systems (TOPLAS)
High-level automatic pipelining for sequential circuits
Proceedings of the 14th international symposium on Systems synthesis
A Statically Allocated Parallel Functional Language
ICALP '00 Proceedings of the 27th International Colloquium on Automata, Languages and Programming
Taming the IXP network processor
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Towards provably correct hardware/software partitioning using occam
CODES '94 Proceedings of the 3rd international workshop on Hardware/software co-design
IBM PowerNP network processor: Hardware, software, and applications
IBM Journal of Research and Development
Programming by sketching for bit-streaming programs
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
FEADS: a framework for exploring the application design space on network processors
International Journal of Parallel Programming
EMSOFT '08 Proceedings of the 8th ACM international conference on Embedded software
RouteBricks: exploiting parallelism to scale software routers
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
SPECTS'09 Proceedings of the 12th international conference on Symposium on Performance Evaluation of Computer & Telecommunication Systems
Thread to strand binding of parallel network applications in massive multi-threaded systems
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
An intrusion detection sensor for the NetVM virtual processor
ICOIN'09 Proceedings of the 23rd international conference on Information Networking
A throughput-driven task creation and mapping for network processors
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Scalable inter-vehicular applications
OTM'07 Proceedings of the 2007 OTM Confederated international conference on On the move to meaningful internet systems - Volume Part II
Language-based optimisation of sensor-driven distributed computing applications
FASE'08/ETAPS'08 Proceedings of the Theory and practice of software, 11th international conference on Fundamental approaches to software engineering
RouteBricks: enabling general purpose network infrastructure
ACM SIGOPS Operating Systems Review
Exploiting Statically Schedulable Regions in Dataflow Programs
Journal of Signal Processing Systems
Multi-core application performance optimization using a constrained tandem queueing model
Journal of Network and Computer Applications
Programming language design and analysis motivated by hardware evolution
SAS'07 Proceedings of the 14th international conference on Static Analysis
Autonomous task partitioning in robot foraging: an approach based on cost estimation
Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Hi-index | 0.00 |
Network processors (NPs) typically contain multiple concurrent processing cores. State-of-the-art programming techniques for NPs are invariably low-level, requiring programmers to partition code into concurrent tasks early in the design process. This results in programs that are hard to maintain and hard to port to alternative architectures. This paper presents a new approach in which a high-level program is separated from its partitioning into concurrent tasks. Designers write their programs in a high-level, domain-specific, architecturally-neutral language, but also provide a separate Architecture Mapping Script (AMS). An AMS specifies semantics-preserving transformations that are applied to the program to re-arrange it into a set of tasks appropriate for execution on a particular target architecture. We (i) describe three such transformations: pipeline introduction, pipeline elimination and queue multiplexing; and (ii) specify when each can be safely applied. As a case study we describe an IP packet-forwarder and present an AMS script that partitions it into a form capable of running at 3Gb/s on an Intel IXP2400 Network Processor.