Stream image processing on a dual-core embedded system

Authors:
Michael G. Benjamin; David Kaeli
Affiliations:
Northeastern University, Computer Architecture Research Laboratory, Dana Research Center, Boston, MA; Northeastern University, Computer Architecture Research Laboratory, Dana Research Center, Boston, MA
Venue:
SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation
Year:
2007

Abstract

Effective memory utilization is critical to reaping the benefits of the multi-core processors emerging in embedded systems. In this paper we explore the use of a stream model to make effective use of memory hierarchies. We target image processing algorithms running on the Analog Devices Blackfin BF561, a fixed-point, dual-core DSP. Using optimized assembly to use the cores effectively reduces runtime, but also underscores the need to mitigate the memory bottleneck. Like other embedded processors, the Blackfin BF561 has L2 SRAM available. Applying the stream model allows us to make full use of both cores and the L2 SRAM. We achieve almost a 10X speedup in execution time compared to non-optimized C code.
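
As a rough illustration of the stream idea described in the abstract, the following portable C sketch processes an image in small strips that are staged through a local buffer, with the rows divided between two workers. It is not the authors' implementation: the threshold kernel, the names `stream_rows` and `stream_threshold`, the constants `STRIP_ROWS` and `MAX_WIDTH`, and the use of `memcpy` as a stand-in for transfers into on-chip SRAM are all assumptions made for illustration, and the two "cores" simply run one after the other here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STRIP_ROWS 4   /* hypothetical strip height, in rows */
#define MAX_WIDTH  64  /* hypothetical maximum row width, in pixels */

/* Stand-in for on-chip SRAM: one strip buffer per core (assumption). */
static uint8_t strip_buf[2][STRIP_ROWS * MAX_WIDTH];

/* Kernel: binary threshold applied in place to a strip held locally. */
static void threshold_kernel(uint8_t *buf, size_t n, uint8_t t)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (buf[i] >= t) ? 255 : 0;
}

/* Stream rows [row_begin, row_end) through one core's strip buffer. */
static void stream_rows(int core, const uint8_t *src, uint8_t *dst,
                        size_t width, size_t row_begin, size_t row_end,
                        uint8_t t)
{
    assert(core == 0 || core == 1);
    assert(width <= MAX_WIDTH);

    uint8_t *local = strip_buf[core];
    for (size_t r = row_begin; r < row_end; r += STRIP_ROWS) {
        size_t rows = (row_end - r < STRIP_ROWS) ? row_end - r : STRIP_ROWS;
        size_t n = rows * width;

        memcpy(local, src + r * width, n);  /* stand-in for a transfer in  */
        threshold_kernel(local, n, t);      /* compute on local data only  */
        memcpy(dst + r * width, local, n);  /* stand-in for a transfer out */
    }
}

/* Split the image rows between two workers; on a dual-core part each
   call would run on its own core, here they run sequentially. */
void stream_threshold(const uint8_t *src, uint8_t *dst,
                      size_t width, size_t height, uint8_t t)
{
    size_t mid = height / 2;
    stream_rows(0, src, dst, width, 0, mid, t);
    stream_rows(1, src, dst, width, mid, height, t);
}
```

The point of the structure is that the kernel only ever touches the small local buffer, so the working set stays bounded by the strip size regardless of image size. On real hardware the two copies would typically be replaced by asynchronous transfers overlapped with computation, which this sketch does not attempt.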