Network processors are heterogeneous system-on-chip multiprocessors that are optimized to perform packet forwarding and processing tasks at gigabit data rates. To meet the performance demands of increasing link speeds and complex network applications, network processors are implemented with several dozen embedded processor cores and hardware accelerators that run multiple packet processing applications in parallel. The parallel nature of the processing system makes it increasingly difficult for application developers to understand and manage resources and to map processing tasks to the hardware. To address this problem, we present a methodology for profiling and analyzing network processor applications, mapping processing tasks to a generalized network processor architecture, and analytically determining the expected throughput performance. The key novelty of this work lies not only in the adaptation of application analysis and mapping algorithms to heterogeneous network processors, but also in the fact that the entire process can be automated and hidden from the application developer. Starting with the analysis of a uniprocessor implementation of the application, the process yields the mapping of the partitioned application that achieves the best performance for a given network processor system. The simplicity of the proposed randomized mapping algorithm allows the use of this methodology in network processor runtime systems, where dynamic reallocation of tasks is necessary but processing power is limited. We present results that show the effectiveness of the analysis and mapping methodology as well as its application to design space exploration.
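To make the idea of a randomized mapping concrete, the following is a minimal sketch and not the algorithm from the paper, whose details the abstract does not give. It assumes that each task has a single fixed processing cost per packet, that all processing elements are identical, and that throughput is limited by the most heavily loaded element. Heterogeneity, task dependencies, and communication costs are ignored. The names `estimate_throughput`, `random_mapping_search`, and the example task costs are hypothetical.

```python
import random


def estimate_throughput(mapping, task_cost, num_pes):
    """Estimate packets per time unit for a task-to-PE mapping.

    Assumption: the busiest processing element (PE) is the bottleneck,
    so throughput is the inverse of the largest per-PE load.
    """
    load = [0.0] * num_pes
    for task, pe in mapping.items():
        load[pe] += task_cost[task]
    return 1.0 / max(load)


def random_mapping_search(task_cost, num_pes, trials=1000, seed=0):
    """Draw random mappings and keep the one with the best estimate."""
    rng = random.Random(seed)  # seeded so repeated runs are reproducible
    tasks = sorted(task_cost)
    best_mapping, best_throughput = None, 0.0
    for _ in range(trials):
        # Assign every task to a PE chosen uniformly at random.
        mapping = {task: rng.randrange(num_pes) for task in tasks}
        throughput = estimate_throughput(mapping, task_cost, num_pes)
        if throughput > best_throughput:
            best_mapping, best_throughput = mapping, throughput
    return best_mapping, best_throughput


# Hypothetical per-packet costs (arbitrary time units) for four tasks.
task_cost = {"parse": 4.0, "lookup": 6.0, "classify": 3.0, "forward": 2.0}
mapping, throughput = random_mapping_search(task_cost, num_pes=2)
print(mapping, throughput)
```

Each trial costs only one pass over the task list, which is the kind of low overhead that makes a randomized search plausible in a runtime system with limited processing power. A more faithful model would replace `estimate_throughput` with an analytical model of the target architecture.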