ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Pixel-per-processing-element (PPE) ratio, the amount of image data directly mapped to each processing element, has a significant impact on the area and energy efficiency of embedded SIMD architectures for image processing applications. This paper quantitatively evaluates the impact of PPE ratio on system performance and efficiency for focal-plane SIMD image processing architectures by comparing throughput, area efficiency, and energy efficiency for a range of common application kernels using architectural and workload simulation. While the impact of grain size depends on the mix of instructions executed within an application program, the most efficient PPE ratio often does not occur at the extremes of PE grain size (i.e., one pixel per processor or one processor per image). In this study, a set of four image processing application tasks is implemented on eight different SIMD configurations. Each configuration has a different PPE ratio and a different amount of local memory. Cycle-accurate simulation and analytical technology modeling allow assessment of execution performance, area efficiency, and energy efficiency for each configuration. Results show that the highest area and energy efficiencies are achieved at PPE ratios between 16 and 256. Using these evaluation techniques (application grain size retargeting combined with area and energy technology modeling), a new class of efficient, embedded SIMD architectures for image processing can be designed.
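As a rough illustration of the quantities compared above, the following Python sketch computes a PPE ratio and two efficiency figures. It is not the paper's simulator or technology model. The function names, the 256 x 256 image size, the PE counts, and the definitions of area efficiency (throughput per unit area) and energy efficiency (throughput per watt) are all assumptions made for this example.

```python
def ppe_ratio(image_width, image_height, num_pes):
    """Pixels mapped to each processing element (hypothetical helper).

    Assumes the image is divided evenly among the PEs.
    """
    pixels = image_width * image_height
    if num_pes <= 0 or pixels % num_pes != 0:
        raise ValueError("image must divide evenly among a positive number of PEs")
    return pixels // num_pes


def area_efficiency(throughput_ops_per_s, area_mm2):
    """Assumed definition: sustained throughput per unit silicon area."""
    return throughput_ops_per_s / area_mm2


def energy_efficiency(throughput_ops_per_s, power_w):
    """Assumed definition: throughput per watt, i.e. operations per joule."""
    return throughput_ops_per_s / power_w


if __name__ == "__main__":
    # Illustrative PE counts for an assumed 256 x 256 image; these are not
    # the eight configurations evaluated in the paper.
    for num_pes in (65536, 4096, 256, 1):
        print(num_pes, "PEs ->", ppe_ratio(256, 256, num_pes), "pixels per PE")
```

The two extremes in the loop, one pixel per PE and one PE for the whole image, correspond to the grain-size extremes mentioned in the abstract. The intermediate PE counts give PPE ratios of 16 and 256.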