A new look at exploiting data parallelism in embedded systems

Authors:
Hillery C. Hunter;Jaime H. Moreno
Affiliations:
University of Illinois at Urbana-Champaign;T.J. Watson Research Center, Yorktown Heights, NY
Venue:
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Year:
2003


Abstract

This paper describes and evaluates three architectural methods for accomplishing data parallel computation in a programmable embedded system. Comparisons are made between the well-studied Very Long Instruction Word (VLIW) and Single Instruction Multiple Packed Data (SIMpD) paradigms; the less-common Single Instruction Multiple Disjoint Data (SIMdD) architecture is described and evaluated. A taxonomy is defined for data-level parallel architectures, and patterns of data access for parallel computation are studied, with measurements presented for over 40 essential telecommunication and media kernels. While some algorithms exhibit data-level parallelism suited to packed vector computation, it is shown that other kernels are most efficiently scheduled with more flexible vector models. This motivates exploration of non-traditional processor architectures for the embedded domain.
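
As a rough illustration of the distinction the abstract draws between packed and disjoint vector operands, the Python sketch below models one vector addition under each access pattern. It is only one plausible reading of the two terms: the function names, the flat `mem` list standing in for data storage, and the per-lane index lists are hypothetical and are not taken from the paper.

```python
# Minimal model of two data-access patterns for a 4-lane vector add.
# All names here are illustrative assumptions, not the paper's notation.

def simd_packed_add(mem, base_a, base_b, width):
    """Packed operands: each operand is `width` contiguous elements."""
    return [mem[base_a + i] + mem[base_b + i] for i in range(width)]

def simd_disjoint_add(mem, idx_a, idx_b):
    """Disjoint operands: each lane names its own element by index."""
    if len(idx_a) != len(idx_b):
        raise ValueError("operand index lists must have the same length")
    return [mem[i] + mem[j] for i, j in zip(idx_a, idx_b)]

mem = list(range(16))  # mem[k] == k, so results are easy to check by hand

# Contiguous operands starting at 0 and 4: lanes compute 0+4, 1+5, 2+6, 3+7.
packed = simd_packed_add(mem, base_a=0, base_b=4, width=4)

# Strided and reversed operands: lanes compute 0+15, 2+13, 4+11, 6+9.
disjoint = simd_disjoint_add(mem, idx_a=[0, 2, 4, 6], idx_b=[15, 13, 11, 9])

print(packed)    # [4, 6, 8, 10]
print(disjoint)  # [15, 15, 15, 15]
```

In this model, the second access pattern would need extra data rearrangement before a packed operation could consume it, which is the kind of kernel the abstract suggests is better served by a more flexible vector model.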