Determining optimal grain size for efficient vector processing on SIMD image processing architectures

Authors:
Jongmyon Kim;D. Scott Wills;Linda M. Wills
Affiliations:
Chip Solution Center, Samsung Advanced Institute of Technology, Kyungki-do, South Korea;School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia;School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia
Venue:
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Year:
2005

Abstract

Adaptable silicon area usage within an integrated pixel processing array is a key issue for embedded single instruction, multiple data (SIMD) image processing architectures because chip resources are limited and application requirements vary. In this regard, this paper explores the effects of varying the number of vector (multichannel) pixels mapped to each processing element (VPPE) within a SIMD architecture. The VPPE ratio has a significant impact on the overall area and energy efficiency of the computational array. Moreover, this paper evaluates the impact of our color-aware instruction set (CAX) on each VPPE configuration to identify the ideal grain size for a given SIMD system extended with CAX. CAX supports parallel operations on two-packed 16-bit (6:5:5) YCbCr (luminance-chrominance) data in a 32-bit datapath processor, providing greater concurrency and efficiency for vector processing of color image sequences. Experimental results for 3-D vector quantization indicate that high processing performance with the lowest cost is achieved at VPPE = 16 with CAX.
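
The packed data format mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The bit layout (Y in the top 6 bits, then Cb, then Cr), the truncating quantization from 8-bit components, and all function names are assumptions made for illustration; the abstract gives only the 6:5:5 split and the 32-bit datapath. A field-wise saturating add stands in for a generic packed operation, since the actual CAX instructions are not described here.

```python
# Illustrative sketch only: layout and names are assumptions, not the CAX spec.

def pack_ycbcr655(y, cb, cr):
    """Quantize 8-bit Y, Cb, Cr to 6, 5 and 5 bits and pack into 16 bits."""
    for v in (y, cb, cr):
        if not 0 <= v <= 255:
            raise ValueError("components must be 8-bit values")
    # Assumed layout: [15:10] = Y, [9:5] = Cb, [4:0] = Cr
    return ((y >> 2) << 10) | ((cb >> 3) << 5) | (cr >> 3)


def unpack_ycbcr655(p):
    """Return the (Y6, Cb5, Cr5) fields of a 16-bit packed pixel."""
    return (p >> 10) & 0x3F, (p >> 5) & 0x1F, p & 0x1F


def pack_pair(p0, p1):
    """Place two 16-bit pixels in one 32-bit word (p0 in the low half)."""
    return ((p1 & 0xFFFF) << 16) | (p0 & 0xFFFF)


def unpack_pair(w):
    """Split a 32-bit word into its low and high 16-bit pixels."""
    return w & 0xFFFF, (w >> 16) & 0xFFFF


def packed_add_saturate(w0, w1):
    """Add matching fields of both pixels in two words, saturating per field."""
    limits = (0x3F, 0x1F, 0x1F)  # maximum value of the 6-, 5- and 5-bit fields
    out = []
    for a, b in zip(unpack_pair(w0), unpack_pair(w1)):
        fields = zip(unpack_ycbcr655(a), unpack_ycbcr655(b), limits)
        y, cb, cr = (min(x + z, m) for x, z, m in fields)
        out.append((y << 10) | (cb << 5) | cr)
    return pack_pair(out[0], out[1])


if __name__ == "__main__":
    pixel = pack_ycbcr655(128, 64, 32)
    word = pack_pair(pixel, pack_ycbcr655(255, 255, 255))
    print(hex(word), hex(packed_add_saturate(word, word)))
```

In this sketch one 32-bit word carries two complete color pixels, so a single word-wide operation touches six component fields at once, which is the kind of concurrency the abstract attributes to the packed format.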