Current multimedia extensions give general-purpose processors a way to meet the growing performance demands of multimedia applications. However, the computing performance of these extensions is often limited because they are designed to process a single data stream. This paper presents an architecture called "multi-streaming SIMD" that enables current multimedia extensions to manipulate multiple data streams simultaneously. To realize the proposed architecture efficiently and flexibly, an operation cell is designed by fusing logic gates and storage cells together. Multiple operation cells are then connected to form a register file capable of performing SIMD operations, called the "Multimedia Operation Storage Unit (MOSU)". Multiple MOSUs are in turn combined into a multi-streaming SIMD computing engine that can manipulate multiple data streams simultaneously while exploiting the subword parallelism of the elements in each stream. This paper also defines three instruction modes (global, coupling, and isolated) that let programmers dynamically configure the multi-streaming SIMD computing engine at the instruction level to handle different numbers of data streams. Simulation results show that a multi-streaming SIMD architecture with four 4-register MOSUs achieves a speedup of 3.3x to 5.5x over traditional MMX extensions on 12 multimedia kernels.
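To make the two levels of parallelism described above concrete, the following is a minimal software sketch, not the paper's hardware design. It assumes 64-bit registers holding eight 8-bit subwords, a wrapping (non-saturating) packed add, and four independent streams standing in for four MOSUs. The names `padd8`, `multi_stream_padd8`, `pack8`, and `unpack8` are hypothetical, and the three instruction modes are not modelled because the abstract does not define their semantics.

```python
# Illustrative model only: register width, lane width, and all names are assumptions.
MASK64 = (1 << 64) - 1
LOW7 = 0x7F7F7F7F7F7F7F7F   # low 7 bits of each 8-bit lane
HIGH1 = 0x8080808080808080  # top bit of each 8-bit lane


def pack8(values):
    """Pack eight 8-bit values into one 64-bit word (lane 0 in the low byte)."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)
    return word


def unpack8(word):
    """Split a 64-bit word back into its eight 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(8)]


def padd8(a, b):
    """Subword-parallel wrapping add of eight 8-bit lanes in one operation."""
    # Adding only the low 7 bits keeps carries inside each lane;
    # the top bit of each lane is then restored with XOR.
    low = (a & LOW7) + (b & LOW7)
    return (low ^ ((a ^ b) & HIGH1)) & MASK64


def multi_stream_padd8(streams_a, streams_b):
    """Apply the same packed add to every stream (one stream per modelled unit)."""
    return [padd8(a, b) for a, b in zip(streams_a, streams_b)]


if __name__ == "__main__":
    # Four streams, each carrying one packed register of eight pixels.
    a = [pack8([250, 1, 2, 3, 4, 5, 6, 7]) for _ in range(4)]
    b = [pack8([10, 1, 1, 1, 1, 1, 1, 1]) for _ in range(4)]
    for result in multi_stream_padd8(a, b):
        print(unpack8(result))
```

In this sketch a conventional single-stream extension corresponds to one call to `padd8`, while the multi-streaming engine corresponds to issuing that same operation across all four streams at once; in software the streams are simply iterated, whereas the paper's hardware would process them concurrently.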