Current MMX-like extensions provide a mechanism for general-purpose processors to meet the growing performance demands of multimedia applications. However, the computing performance of these extensions is often limited because they operate on only a single data stream. To overcome this obstacle, this paper presents a "multi-streaming SIMD architecture" that enables one SIMD instruction to manipulate multiple data streams simultaneously. The proposed architecture is a Processor-In-Memory-like register-file architecture that includes SIMD operating logic, allowing general-purpose processors to extend current MMX-like extensions for higher performance. To realize the proposed architecture efficiently and flexibly, an operation cell is designed by fusing logic gates and storage cells together. The operation cells are then used to compose a register file capable of performing SIMD operations, called the "Multimedia Operation Storage Unit (MOSU)". Multiple MOSUs are in turn used to compose a multi-streaming SIMD computing engine that can manipulate multiple data streams simultaneously and exploit the subword parallelism of the elements in each data stream. Three instruction modes (global, coupling, and isolated) are defined for the MMX-like extensions to adjust the number of parallel data streams and to utilize the computation resources efficiently. Simulation results show that a multi-streaming SIMD architecture with four 4-register MOSUs achieves a performance improvement of 3.3x to 5.5x over Intel's MMX extensions on eleven multimedia kernels.
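The core idea of the abstract, one SIMD instruction applied to several data streams at once, can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's design: each list element stands in for a register in a separate MOSU, registers are assumed to be 64 bits wide with four 16-bit subwords (as in MMX), and the operation is a wrapping packed add. The helper names (`pack16`, `unpack16`, `padd16`, `multi_stream_padd16`) are hypothetical, and the three instruction modes are not modeled because the abstract does not define their semantics.

```python
MASK16 = 0xFFFF  # one 16-bit subword
LANES = 4        # four 16-bit subwords per assumed 64-bit register


def pack16(values):
    """Pack four 16-bit values into one 64-bit integer, lane 0 in the low bits."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & MASK16) << (16 * lane)
    return word


def unpack16(word):
    """Split a 64-bit integer back into its four 16-bit subwords."""
    return [(word >> (16 * lane)) & MASK16 for lane in range(LANES)]


def padd16(a, b):
    """Wrapping packed add: each 16-bit lane is added independently."""
    result = 0
    for lane in range(LANES):
        shift = 16 * lane
        lane_sum = ((a >> shift) & MASK16) + ((b >> shift) & MASK16)
        result |= (lane_sum & MASK16) << shift  # discard the carry out of the lane
    return result


def multi_stream_padd16(streams_a, streams_b):
    """Apply the same packed add to every stream.

    Assumption: each list element models one register in a separate MOSU,
    so a single call stands in for one multi-streaming instruction.
    """
    return [padd16(a, b) for a, b in zip(streams_a, streams_b)]


# Four streams, mirroring the four-MOSU configuration mentioned in the abstract.
streams_a = [
    pack16([1, 2, 3, 4]),
    pack16([10, 20, 30, 40]),
    pack16([0xFFFF, 0, 0, 0]),
    pack16([5, 5, 5, 5]),
]
streams_b = [pack16([1, 1, 1, 1])] * 4

results = [unpack16(word) for word in multi_stream_padd16(streams_a, streams_b)]
```

In this model a conventional MMX-style instruction corresponds to a single `padd16` call on one stream, while the multi-streaming variant processes all four streams per call. In hardware the streams would be handled in parallel rather than in a Python loop.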