Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
Exploiting superword level parallelism with multimedia instruction sets
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Graph-based code selection techniques for embedded processors
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Automatic intra-register vectorization for the Intel architecture
International Journal of Parallel Programming
Vectorization for SIMD architectures with alignment constraints
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Superword-Level Parallelism in the Presence of Control Flow
Proceedings of the international symposium on Code generation and optimization
Multi-platform Auto-vectorization
Proceedings of the International Symposium on Code Generation and Optimization
Auto-vectorization of interleaved data for SIMD
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Generation of Pack Instruction Sequence for Media Processors Using Multi-Valued Decision Diagram
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Hi-index | 0.00 |
Although SIMD instructions are effective for many digital signal processing applications, current compilers cannot take full advantage of SIMD instructions. One factor inhibiting SIMD code generation is control flow structure; the target scope of SIMD code generation is currently limited to single basic block or loop that consists of single basic block. SIMD instructions cannot be mapped typically across basic block boundaries even if basic blocks inside the control structure have enough parallelism. In this paper, a new compilation technique to generate SIMD code without modifying control flow structure is proposed. The data dependency between basic blocks is exploited to generate SIMD instructions. The packing cost is introduced for effective vectorization to maintain data dependency across basic block boundaries. Experimental results show that the new SIMD code generation technique reduced 67% of dynamic execution cycles of inter prediction in H.264 decoder.