A scalable and flexible data synchronization scheme for embedded HW-SW shared-memory systems
Proceedings of the 14th international symposium on Systems synthesis
Proceedings of the 41st annual Design Automation Conference
ACSD '06 Proceedings of the Sixth International Conference on Application of Concurrency to System Design
Throughput Analysis of Synchronous Data Flow Graphs
ACSD '06 Proceedings of the Sixth International Conference on Application of Concurrency to System Design
Multi-processor system design with ESPAM
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
pn: a tool for improved derivation of process networks
EURASIP Journal on Embedded Systems
Decoupling of Computation and Communication with a Communication Assist
DSD '07 Proceedings of the 10th Euromicro Conference on Digital System Design Architectures, Methods and Tools
The cell broadband engine: exploiting multiple levels of parallelism in a chip multiprocessor
International Journal of Parallel Programming
Efficient hardware code generation for FPGAs
ACM Transactions on Architecture and Code Optimization (TACO)
Throughput-Buffering Trade-Off Exploration for Cyclo-Static and Synchronous Dataflow Graphs
IEEE Transactions on Computers
Optimus: efficient realization of streaming applications on FPGAs
CASES '08 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
Requirements on the execution of Kahn process networks
ESOP'03 Proceedings of the 12th European conference on Programming
A predictable communication assist
Proceedings of the 7th ACM international conference on Computing frontiers
Journal of Systems Architecture: the EUROMICRO Journal
Automated bottleneck-driven design-space exploration of media processing systems
Proceedings of the Conference on Design, Automation and Test in Europe
Proceedings of the Conference on Design, Automation and Test in Europe
Feasibility analysis of ultra high frame rate visual servoing on FPGA and SIMD processor
ACIVS'11 Proceedings of the 13th international conference on Advanced concepts for intelligent vision systems
Hi-index | 0.00 |
Streaming applications are an important class of applications in emerging embedded systems such as smart camera network, unmanned vehicles, and industrial printing. These applications are usually very computationally intensive and have real-time constraints. To meet the increasing demand for performance and efficiency in these applications, the use of application specific IP cores in heterogeneous Multi-Processor System-on-Chips (MPSoCs) becomes inevitable. However, two of the key challenges in integrating these IP cores into MPSoCs are (i) how to properly handle inter-core communication; (ii) how to map streaming applications in an efficient and predictable way. In this paper, we first present a predictable high-performance communication assist (CA) that helps to tackle these design challenges. The proposed CA has zero throughput overhead, negligible latency overhead, and significantly less resource usage compared to existing CA designs. The proposed CA also provides a unified abstract interface for both processors and accelerator IP cores with flexible data access support. Based on the proposed CA design, we present a predictable heterogeneous multi-processor platform template for streaming applications. The template is used in a predictable design flow that uses Synchronous Data Flow (SDF) graphs for design time analysis. An accurate SDF model of our CA is introduced, enabling the mapping of applications onto heterogeneous MPSoCs in an efficient and predictable way. As a case study, we map the complete high-speed vision processing pipeline of an industrial application, Organic Light Emitting Diode (OLED) screen printing, onto one instance of the proposed platform. The result demonstrates that system design and analysis effort is greatly reduced with the proposed CA-based design flow.