A performance comparison of contemporary DRAM architectures
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Proceedings of the 27th annual international symposium on Computer architecture
Memory arbitration and cache management in stream-based systems
DATE '00 Proceedings of the conference on Design, automation and test in Europe
Data and memory optimization techniques for embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Proceedings of the 14th international symposium on Systems synthesis
High-Performance DRAMs in Workstation Environments
IEEE Transactions on Computers
Viper: A Multiprocessor SOC for Advanced Set-Top Box and Digital TV Systems
IEEE Design & Test
VLSI Architecture for Motion Estimation using the Block-Matching Algorithm
EDTC '96 Proceedings of the 1996 European conference on Design and Test
A Reconfigurable Hardware Platform for Digital Real-Time Signal Processing in Television Studios
FCCM '00 Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines
Evaluating the Imagine Stream Architecture
Proceedings of the 31st annual international symposium on Computer architecture
Real-Time Motion Estimation and Visualization on Graphics Cards
VIS '04 Proceedings of the conference on Visualization '04
Traffic shaping for an FPGA based SDRAM controller with complex QoS requirements
Proceedings of the 42nd annual Design Automation Conference
An Image Processor for Digital Film
ASAP '05 Proceedings of the 2005 IEEE International Conference on Application-Specific Systems, Architecture Processors
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Introduction to the cell multiprocessor
IBM Journal of Research and Development - POWER5 and packaging
Evaluation of design alternatives for the 2-D-discrete wavelet transform
IEEE Transactions on Circuits and Systems for Video Technology
An efficient quality-aware memory controller for multimedia platform SoC
IEEE Transactions on Circuits and Systems for Video Technology
FlexWAFE - a high-end real-time stream processing library for FPGAs
Proceedings of the 44th annual Design Automation Conference
Improved response time analysis of tasks scheduled under preemptive Round-Robin
CODES+ISSS '07 Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
SAMOS '08 Proceedings of the 8th international workshop on Embedded Computer Systems: Architectures, Modeling, and Simulation
Application development with the FlexWAFE real-time stream processing architecture for FPGAs
ACM Transactions on Embedded Computing Systems (TECS)
Application-specific memory performance of a heterogeneous reconfigurable architecture
Proceedings of the Conference on Design, Automation and Test in Europe
Mapping of a film grain removal algorithm to a heterogeneous reconfigurable architecture
Proceedings of the Conference on Design, Automation and Test in Europe
A platform for high level synthesis of memory-intensive image processing algorithms
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
The Journal of Supercomputing
Application space exploration of a heterogeneous run-time configurable digital signal processor
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
Digital film processing is characterized by a resolution of at least 2 K (2048 × 1536 pixels per frame at 30 bit/pixel and 24 pictures/s, data rate of 2.2 Gbit/s); higher resolutions of 4 K (8.8 Gbit/s) and even 8 K (35.2Gbit/s) are on their way. Real-time processing at this data rate is beyond the scope of today's standard and DSP processors, and ASICs are not economically viable due to the small market volume. Therefore, an FPGA-based approach was followed in the FlexFilm project. Different applications are supported on a single hardware platform by using different FPGA configurations. The multiboard, multi-FPGA hardware/software architecture, is based on Xilinx Virtex-II Pro FPGAs which contain the reconfigurable image stream processing data path, large SDRAM memories for multiple frame storage, and a PCI-Express communication backbone network. The FPGA-embedded CPU is used for control and less computation intensive tasks. This paper will focus on three key aspects: (a) the used design methodology which combines macro component configuration and macrolevel floorplaning with weak programmability using distributed microcoding, (b) the global communication framework with communication scheduling, and (c) the configurable multistream scheduling SDRAM controller with QoS support by access prioritization and traffic shaping. As an example, a complex noise reduction algorithm including a 2.5-dimension discrete wavelet transformation (DWT) and a full 16 × 16 motion estimation (ME) at 24 fps, requiring a total of 203 Gops/s net computing performance and a total of 28 Gbit/s DDR-SDRAM frame memory bandwidth, will be shown.