A new compilation technique for SIMD code generation across basic block boundaries

Authors:
Hiroaki Tanaka;Yutaka Ota;Nobu Matsumoto;Takuji Hieda;Yoshinori Takeuchi;Masaharu Imai
Affiliations:
Semiconductor Company, Saiwai-ku, Kawasaki, Japan;Semiconductor Company, Saiwai-ku, Kawasaki, Japan;Semiconductor Company, Saiwai-ku, Kawasaki, Japan;Osaka University, Suita, Osaka, Japan;Osaka University, Suita, Osaka, Japan;Osaka University, Suita, Osaka, Japan
Venue:
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Year:
2010

Abstract

Although SIMD instructions are effective for many digital signal processing applications, current compilers cannot take full advantage of them. One factor inhibiting SIMD code generation is control flow structure: the target scope of SIMD code generation is currently limited to a single basic block or to a loop that consists of a single basic block. SIMD instructions typically cannot be mapped across basic block boundaries, even if the basic blocks inside the control structure have enough parallelism. In this paper, a new compilation technique that generates SIMD code without modifying the control flow structure is proposed. The data dependency between basic blocks is exploited to generate SIMD instructions. A packing cost is introduced so that vectorization remains effective while data dependencies are maintained across basic block boundaries. Experimental results show that the new SIMD code generation technique reduced the dynamic execution cycles of inter prediction in an H.264 decoder by 67%.
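
The effect of packing cost at a basic block boundary can be illustrated with a small sketch. This is not the algorithm from the paper; it is a toy model under stated assumptions: a diamond-shaped control flow (block A, then either block B or block C), 4-lane vectors emulated as Python tuples, and hypothetical helpers (`pack`, `unpack`, `vadd`, `vsub`, `vscale`) that only count the operations a compiler would have to emit. It compares vectorizing each block in isolation, where values crossing the boundary are assumed to be scalar, against keeping the value packed across the boundary.

```python
# Toy model (assumption, not the paper's method): count pack/unpack
# operations needed when a 4-lane value crosses a basic block boundary.

def new_stats():
    return {"simd": 0, "pack": 0, "unpack": 0}

def pack(scalars, stats):
    # Hypothetical cost: one insert per lane to build a vector.
    stats["pack"] += len(scalars)
    return tuple(scalars)

def unpack(vec, stats):
    # Hypothetical cost: one extract per lane to recover scalars.
    stats["unpack"] += len(vec)
    return list(vec)

def vadd(x, y, stats):
    stats["simd"] += 1
    return tuple(p + q for p, q in zip(x, y))

def vsub(x, y, stats):
    stats["simd"] += 1
    return tuple(p - q for p, q in zip(x, y))

def vscale(x, k, stats):
    stats["simd"] += 1
    return tuple(p * k for p in x)

def per_block(a, b, c, cond):
    """Each basic block is vectorized alone; live values cross as scalars."""
    stats = new_stats()
    # Block A: vector add, then unpack because the successors expect scalars.
    t_scalars = unpack(vadd(a, b, stats), stats)
    if cond:
        # Block B: repack the scalars before the vector operation.
        r = vscale(pack(t_scalars, stats), 2, stats)
    else:
        # Block C: repack the scalars before the vector operation.
        r = vsub(pack(t_scalars, stats), c, stats)
    return r, stats

def cross_block(a, b, c, cond):
    """The packed value stays in a vector register across the boundary."""
    stats = new_stats()
    t = vadd(a, b, stats)          # Block A
    if cond:
        r = vscale(t, 2, stats)    # Block B uses the packed value directly
    else:
        r = vsub(t, c, stats)      # Block C uses the packed value directly
    return r, stats

if __name__ == "__main__":
    a, b, c = (1, 2, 3, 4), (10, 20, 30, 40), (1, 1, 1, 1)
    for cond in (True, False):
        print(cond, per_block(a, b, c, cond), cross_block(a, b, c, cond))
```

Both strategies compute the same result and issue the same number of SIMD operations; in this model the only difference is the pack and unpack traffic at the boundary, which is the kind of overhead a packing cost would have to account for. The control flow itself is unchanged in both versions.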