ROS-DMA: A DMA Double Buffering Method for Embedded Image Processing with Resource Optimized Slicing

Authors:
Christian Zinner; Wilfried Kubinger
Affiliations:
ARC Seibersdorf Research, Austria; ARC Seibersdorf Research, Austria
Venue:
RTAS '06 Proceedings of the 12th IEEE Real-Time and Embedded Technology and Applications Symposium
Year:
2006

Abstract

Image processing on a Digital Signal Processor (DSP) often requires image data to be stored in external memory, because fast on-chip memory is usually very limited. Processing images directly in external memory incurs a significant performance penalty. This paper presents a double-buffering method using Direct Memory Access (DMA), called Resource Optimized Slicing (ROS-DMA), which is intended to be used instead of a Level 2 (L2) data cache. The idea of ROS-DMA is to transfer image slices into small intermediate buffers in fast internal memory, where they can be processed at the full speed of the processor. Using DMA allows the data transfers and the processing to proceed in parallel. The proposed method has a modular implementation, which makes it easy to reuse components for various image processing operations. The sequence of transfers is organized so that the use of processor resources is optimized to achieve the shortest possible execution time. ROS-DMA can yield substantially better performance than an L2 cache. Furthermore, we expect that ROS-DMA will make it easier to obtain reliable and tight Worst-Case Execution Times (WCETs). In test runs on a C6416 DSP from Texas Instruments, ROS-DMA achieved up to six times faster execution than the L2 cache.
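
The slice-wise double buffering described in the abstract can be illustrated with a minimal sketch in C. This is not the authors' implementation and does not reproduce the transfer ordering of ROS-DMA; it only shows the general ping-pong pattern under stated assumptions. The names `dma_start`, `dma_wait`, `process_slice`, `invert_image_sliced`, `SLICE_ROWS`, and `MAX_WIDTH` are hypothetical, the "DMA" is simulated with a synchronous `memcpy`, and the static arrays stand in for fast on-chip memory. The example kernel (pixel inversion) is chosen only because it is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLICE_ROWS 4   /* hypothetical slice height in rows */
#define MAX_WIDTH  64  /* hypothetical maximum image width in pixels */

/* Stand-ins for fast on-chip memory: two input and two output slice buffers. */
static uint8_t in_buf[2][SLICE_ROWS * MAX_WIDTH];
static uint8_t out_buf[2][SLICE_ROWS * MAX_WIDTH];

/* Hypothetical DMA stand-ins. On real hardware dma_start() would queue an
 * asynchronous transfer and dma_wait() would block until all queued
 * transfers have completed. Here the copy is synchronous. */
static void dma_start(void *dst, const void *src, size_t n) { memcpy(dst, src, n); }
static void dma_wait(void) { /* nothing outstanding in this simulation */ }

/* Example kernel working only on on-chip buffers: invert each pixel. */
static void process_slice(const uint8_t *src, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(255 - src[i]);
}

/* Number of rows in slice s (the last slice may be shorter). */
static size_t rows_in_slice(size_t height, size_t s)
{
    size_t remaining = height - s * SLICE_ROWS;
    return remaining < SLICE_ROWS ? remaining : SLICE_ROWS;
}

/* Process an image held in "external" memory slice by slice. While slice s
 * is processed, slice s+1 is loaded and slice s-1 is written back. */
void invert_image_sliced(const uint8_t *ext_src, uint8_t *ext_dst,
                         size_t width, size_t height)
{
    assert(width > 0 && width <= MAX_WIDTH);
    size_t num_slices = (height + SLICE_ROWS - 1) / SLICE_ROWS;
    if (num_slices == 0)
        return;

    /* Prologue: fetch the first slice. */
    dma_start(in_buf[0], ext_src, rows_in_slice(height, 0) * width);

    for (size_t s = 0; s < num_slices; s++) {
        size_t cur = s & 1, nxt = cur ^ 1;

        /* Load of slice s and write-back of slice s-2 must be finished. */
        dma_wait();

        if (s + 1 < num_slices)  /* prefetch the next slice */
            dma_start(in_buf[nxt], ext_src + (s + 1) * SLICE_ROWS * width,
                      rows_in_slice(height, s + 1) * width);
        if (s > 0)               /* write back the previous result */
            dma_start(ext_dst + (s - 1) * SLICE_ROWS * width, out_buf[nxt],
                      rows_in_slice(height, s - 1) * width);

        /* On real hardware this runs while the two transfers are in flight. */
        process_slice(in_buf[cur], out_buf[cur], rows_in_slice(height, s) * width);
    }

    /* Epilogue: write back the last slice. */
    size_t last = num_slices - 1;
    dma_wait();
    dma_start(ext_dst + last * SLICE_ROWS * width, out_buf[last & 1],
              rows_in_slice(height, last) * width);
    dma_wait();
}
```

Calling `invert_image_sliced(src, dst, width, height)` on an image whose height is not a multiple of `SLICE_ROWS` exercises the shorter final slice. The kernel touches only the small on-chip buffers, which is the property the abstract relies on for both speed and more predictable timing.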