Boosting the performance of multimedia applications using SIMD instructions

Authors:
Weihua Jiang;Chao Mei;Bo Huang;Jianhui Li;Jiahua Zhu;Binyu Zang;Chuanqi Zhu
Affiliations:
Parallel Processing Institute, Fudan University, Shanghai, China;Parallel Processing Institute, Fudan University, Shanghai, China;Intel China Software Center, Intel China Ltd, Shanghai, China;Intel China Software Center, Intel China Ltd, Shanghai, China;Parallel Processing Institute, Fudan University, Shanghai, China;Parallel Processing Institute, Fudan University, Shanghai, China;Parallel Processing Institute, Fudan University, Shanghai, China
Venue:
CC'05 Proceedings of the 14th international conference on Compiler Construction
Year:
2005

References (14):

Advanced compiler optimizations for supercomputers
Communications of the ACM - Special issue on parallelism

Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)

A text-compression-based method for code size minimization in embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)

Bitwidth analysis with application to silicon compilation
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation

Exploiting superword level parallelism with multimedia instruction sets
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation

Compilation techniques for multimedia processors
International Journal of Parallel Programming - Special issue on instruction-level parallelism and parallelizing compilation, Part 1

A vectorizing compiler for multimedia extensions
International Journal of Parallel Programming - Special issue on instruction-level parallelism and parallelizing compilation, Part 1

Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages

Measuring the Performance of Multimedia Instruction Sets
IEEE Transactions on Computers

Transforming and Parallelizing ANSI C Programs using Pattern Recognition
HPCN Europe '99 Proceedings of the 7th International Conference on High-Performance Computing and Networking

Compiling for SIMD Within a Register
LCPC '98 Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing

Designing the Agassiz Compiler for Concurrent Multithreaded Architectures
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing

Design and characterization of the Berkeley multimedia workload
Multimedia Systems

Automatic detection of saturation and clipping idioms
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing

Cited by (3):

Optimizing compiler for shared-memory multiple SIMD architecture
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems

Optimizing techniques for saturated arithmetic with first-order linear recurrence
Proceedings of the 2009 ACM symposium on Applied Computing

Data pipeline optimization for shared memory multiple-SIMD architecture
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing

Abstract

The multimedia extensions (MME) of modern processors provide SIMD ISAs to boost the performance of typical operations in multimedia applications. However, automatic vectorization support for them is still immature. The key difficulty is how to vectorize the SIMD-ISA-supported idioms found in source code in an efficient and general way. In this paper, we introduce a powerful and extendable recognition engine to solve this problem; it needs only a small number of rules to recognize many such idioms and generate efficient SIMD instructions. We integrated this engine into the classic vectorization framework and obtained very good speedups for some real-life applications.
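
As an illustration of what a SIMD-ISA-supported idiom can look like in source code, the following is a minimal sketch in C of a saturating byte addition written with ordinary scalar operations. The abstract does not list the idioms the engine handles, so the choice of saturating addition is an assumption made for illustration, and the function names sat_add_u8 and sat_add_loop are hypothetical. The point is only that the clamp-on-overflow pattern corresponds to a single packed saturating-add instruction on many multimedia extensions (PADDUSB on x86, for example), which a plain vectorizer would not emit unless it recognizes the pattern.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar saturating add written as a conditional idiom: the sum is
   clamped to 255 instead of wrapping around. (Hypothetical helper,
   not taken from the paper.) */
static uint8_t sat_add_u8(uint8_t x, uint8_t y)
{
    unsigned sum = (unsigned)x + (unsigned)y;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Element-wise loop containing the idiom. An idiom-aware vectorizer
   could in principle map the loop body to a packed unsigned saturating
   add, processing several bytes per instruction. */
static void sat_add_loop(uint8_t *dst, const uint8_t *a,
                         const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = sat_add_u8(a[i], b[i]);
}
```

Written this way, the loop carries no cross-iteration dependence, so once the clamp is recognized as one operation the rest of the loop fits the classic vectorization framework the abstract refers to.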