Exploiting instruction level parallelism in the presence of conditional branches
Exploiting instruction level parallelism in the presence of conditional branches
Mapping irregular applications to DIVA, a PIM-based data-intensive architecture
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Code selection for media processors with SIMD instructions
DATE '00 Proceedings of the conference on Design, automation and test in Europe
Exploiting superword level parallelism with multimedia instruction sets
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Compilation techniques for multimedia processors
International Journal of Parallel Programming - Special issue on instruction-level parallelism and parallelizing compilation, Part 1
A vectorizing compiler for multimedia extensions
International Journal of Parallel Programming - Special issue on instruction-level parallelism and parallelizing compilation, Part 1
The architecture of the DIVA processing-in-memory chip
ICS '02 Proceedings of the 16th international conference on Supercomputing
Automatic intra-register vectorization for the Intel architecture
International Journal of Parallel Programming
Subword Parallelism with MAX-2
IEEE Micro
Vectorization for SIMD architectures with alignment constraints
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Superword-Level Parallelism in the Presence of Control Flow
Proceedings of the international symposium on Code generation and optimization
Hi-index | 0.00 |
This paper addresses how to automatically generate code for multimedia extension architectures in the presence of conditionals. We evaluate the costs and benefits of exploiting branches on the aggregate condition codes associated with the fields of a superword (an aggregate object larger than a machine word) such as the branch-on-any instruction of the AltiVec. Branch-on-superword-condition-codes (BOSCC) instructions allow fast detection of aggregate conditions, an optimization opportunity often found in multimedia applications. This paper presents compiler analyses and techniques for generating efficient parallel code using BOSCC instructions. We evaluate our approach, which has been implemented in the SUIF compiler, through a set of experiments with multimedia benchmarks, and compare it with the default approach previously implemented in our compiler. Our experimental results show that using BOSCC instructions can result in better performance for applications where the aggregate condition codes of a superword often evaluate to the same value.