Static scheduling of synchronous data flow programs for digital signal processing
IEEE Transactions on Computers
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Cluster I/O with River: making the fast case common
Proceedings of the sixth workshop on I/O in parallel and distributed systems
Digital Image Processing
TelegraphCQ: continuous dataflow processing
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Programmable Stream Processors
Computer
Brook for GPUs: stream computing on graphics hardware
ACM SIGGRAPH 2004 Papers
Dynamic Load Distribution in the Borealis Stream Processor
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
A reconfigurable architecture for load-balanced rendering
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
Stream Programming on General-Purpose Processors
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Introduction to the cell multiprocessor
IBM Journal of Research and Development - POWER5 and packaging
Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Evaluating MapReduce for Multi-core and Multiprocessor Systems
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Dynamic load balancing for distributed search
HPDC '05 Proceedings of the High Performance Distributed Computing, 2005. HPDC-14. Proceedings. 14th IEEE International Symposium
Cell/B.E. blades: building blocks for scalable, real-time, interactive, and digital media servers
IBM Journal of Research and Development
Orchestrating the execution of stream programs on multicore platforms
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
The connection-then-credit flow control protocol for heterogeneous multicore systems-on-chip
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems - Special issue on the 2009 ACM/IEEE international symposium on networks-on-chip
On the costs and benefits of stochasticity in stream processing
Proceedings of the 47th Design Automation Conference
Adaptive task duplication using on-line bottleneck detection for streaming applications
Proceedings of the 9th conference on Computing Frontiers
Auto-parallelizing stateful distributed streaming applications
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
DANBI: dynamic scheduling of irregular stream programs for many-core systems
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Flexible filters in stream programs
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
Stream processing is a promising paradigm for programming multi-core systems for high-performance embedded applications. We propose flexible filters as a technique that combines static mapping of the stream program tasks with dynamic load balancing of their execution. The goal is to improve the system-level processing throughput of the program when it is executed on a distributed-memory multi-core system as well as the local (core-level) memory utilization. Our technique is distributed and scalable because it is based on point-to-point handshake signals exchanged between neighboring cores. Load balancing with flexible filters can be applied to stream applications that present large dynamic variations in the computational load of their tasks and the dimension of the stream data tokens. In order to demonstrate the practicality of our technique, we present performance improvements for the case study of a JPEG encoder running on the IBM Cell multi-core processor.