Compiling techniques for first-order liner recurrences on a Vector computer
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Computing Programs Containing Band Linear Recurrences on Vector Supercomputers
IEEE Transactions on Parallel and Distributed Systems
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Measuring the Performance of Multimedia Instruction Sets
IEEE Transactions on Computers
Proceedings of the 8th International Symposium on Parallel Processing
Design and characterization of the Berkeley multimedia workload
Multimedia Systems
Parallel Processing of First Order Linear Recurrence on SMP Machines
The Journal of Supercomputing
An Empirical Study On the Vectorization of Multimedia Applications for Multimedia Extensions
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Boosting the performance of multimedia applications using SIMD instructions
CC'05 Proceedings of the 14th international conference on Compiler Construction
Overflow controlled SIMD arithmetic
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Automatic detection of saturation and clipping idioms
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Hi-index | 0.00 |
Saturated arithmetic is a typical operation in multimedia applications, most multimedia extensions in the instruction set architecture (ISA) of modern processors provide saturation instructions for such operation. Therefore, extensive researches have focused on how to utilize saturation instructions to optimize programs. Previous algorithms mainly focus on purely saturated arithmetic, however saturated arithmetic is often mingled with first-order linear recurrence (FOLR) in real life applications. When FLOR pattern appears in the program, previous algorithms can not identify the saturated arithmetic as well. In fact, the saturated arithmetic with FOLR (SAWF) is a new and significant pattern, especially, SAWF with one as coefficient is frequently used in multimedia applications. Hence, it is necessary to explore a method with which such pattern can be efficiently vectorized. This paper discusses how to vectorize SAWF, explores the efficient method to vectorize SAWF with one as coefficient and gives its evaluation and implement a library for the optimizing technique. Such an implementation manner can make compilers are able to exploit it more easily. The experimental results shows the optimizing technique can achieve a speedup of 1.19 to 1.46 on Pentium IV processor. At the same time, the optimizing techniques in this paper can also be used to develop a library for SAWF so a programmer can benefit even without changing the compiler.