Evolving video coding standards demand functional flexibility from implementations, not only at design time but also after fabrication. This paper presents a System-on-Chip design approach for a video encoder that offers a feasible combination of performance, scalability, programmability, area efficiency, and design-time effort. The encoder is based on a homogeneous master-slave processor architecture. Each slave encodes a part of the frame according to the Single Program Multiple Data (SPMD) data-parallel model. Both shared and distributed memory architectures are presented. Design effort is reduced by identical program code, automated assembly of software and hardware modules independent of the number and type of processors, and our flexible on-chip communication network called Heterogeneous IP Block Interconnection (HIBI). A case study implementation with two to ten simple ARM7 processors, a 32-bit HIBI bus, and non-optimized, processor-independent software achieves 6 to 53 fps for QCIF. The whole encoder area ranges from 173 to 770 kgates, excluding the memories. The relation scales reasonably well to systems with more powerful processors and optimized code. Optimization of the communication network shows that, with more than six slaves, even a serial HIBI connection running at 100 MHz is feasible. HIBI and the parallelization approach allow exploration and optimization of the communication at both the application and architecture layers.
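To make the SPMD data-parallel idea concrete, the following minimal sketch shows one way a master could divide a QCIF frame among slaves. It is an illustration under stated assumptions, not the paper's method: the abstract does not say how the frame is split, so the contiguous macroblock-range scheme and the function name `partition_macroblocks` are hypothetical. The only fixed facts used are the QCIF resolution (176x144 pixels) and the 16x16 macroblock size, which give 99 macroblocks per frame.

```python
# Hypothetical sketch: split the macroblocks of a QCIF frame into
# contiguous ranges, one per slave. The actual partitioning used by
# the encoder in the paper is not described in the abstract.

QCIF_WIDTH, QCIF_HEIGHT = 176, 144   # QCIF luma resolution in pixels
MB_SIZE = 16                         # macroblock edge length in pixels


def partition_macroblocks(num_slaves, width=QCIF_WIDTH, height=QCIF_HEIGHT):
    """Return one (first_mb, end_mb) half-open range per slave.

    Macroblocks are numbered in raster order. The split is as even as
    possible: the first `total % num_slaves` slaves get one extra block.
    """
    if num_slaves < 1:
        raise ValueError("num_slaves must be at least 1")

    total = (width // MB_SIZE) * (height // MB_SIZE)  # 11 * 9 = 99 for QCIF
    base, extra = divmod(total, num_slaves)

    ranges = []
    start = 0
    for slave in range(num_slaves):
        count = base + (1 if slave < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges


if __name__ == "__main__":
    # Every slave would run the same encoding program on its own range.
    for slave_id, (first, end) in enumerate(partition_macroblocks(4)):
        print(f"slave {slave_id}: macroblocks {first}..{end - 1}")
```

Because every slave executes the same program and differs only in the range it receives, changing the number of slaves changes only the argument to the partitioning step, which is consistent with the abstract's claim that the software is independent of the processor count.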