“Combining” as a compilation technique for VLIW architectures

Authors:
T. Nakatani;K. Ebcioğlu
Affiliations:
IBM Tokyo Research Laboratory, 5-19 Sanbancho, Chiyoda-ku, Tokyo; IBM Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY
Venue:
MICRO 22 Proceedings of the 22nd annual workshop on Microprogramming and microarchitecture
Year:
1989

Abstract

Combining is a local compiler optimization technique that can enhance the performance of global compaction techniques for VLIW machines. Given two adjacent operations of a certain class that are flow (read-after-write) dependent and therefore cannot be placed in the same micro-instruction, combining transforms them so that the modified operations no longer depend on each other. The transformed operations can then be executed in the same micro-instruction, which reduces the total execution time of the program. This paper proposes combining a pair of flow-dependent operations into a wide instruction word as an important compilation technique for VLIW architectures. Combining is particularly effective with software pipelining and loop unrolling, since these techniques make it more likely that combinable operations end up adjacent. We implemented combining in our parallelizing compiler for the wide instruction word architecture now being built at the IBM T. J. Watson Research Center. Combining yields a ten percent speedup on the Stanford integer benchmarks and other sequential-natured C programs, compared with compaction techniques that do not use combining. For a class of inner loops, combining can remove the inter-iteration dependencies completely and can improve performance in proportion to the unrolling factor.
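
The transformation described in the abstract can be illustrated with a minimal sketch. The abstract does not name the class of combinable operations, so the sketch assumes add-immediate operations (`dest = src + imm`) and a machine model in which every operation in a micro-instruction reads its operands before any result is written. The names `AddImm`, `combine`, `run_sequential`, and `run_parallel` are hypothetical and are not taken from the paper or its compiler.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AddImm:
    """Hypothetical add-immediate operation: dest = src + imm."""
    dest: str
    src: str
    imm: int


def combine(first: AddImm, second: AddImm) -> Optional[AddImm]:
    """Rewrite `second` so that it no longer reads the result of `first`.

    Returns None when the pair is not flow dependent or cannot share
    one micro-instruction under the assumed model.
    """
    if second.src != first.dest:
        return None  # no flow (read-after-write) dependence to remove
    if second.dest == first.dest:
        return None  # two writes to one register in one instruction
    # second = (first.src + first.imm) + second.imm, folded into one add
    return AddImm(second.dest, first.src, first.imm + second.imm)


def run_sequential(ops: List[AddImm], regs: Dict[str, int]) -> Dict[str, int]:
    """Execute the operations one after another."""
    regs = dict(regs)
    for op in ops:
        regs[op.dest] = regs[op.src] + op.imm
    return regs


def run_parallel(ops: List[AddImm], regs: Dict[str, int]) -> Dict[str, int]:
    """Execute all operations as one micro-instruction (assumed model):
    every operand is read from the old state, then all results are written."""
    results = {op.dest: regs[op.src] + op.imm for op in ops}
    return {**regs, **results}


# r1 = r2 + 4 ; r3 = r1 + 8   becomes   r1 = r2 + 4 || r3 = r2 + 12
first = AddImm("r1", "r2", 4)
second = AddImm("r3", "r1", 8)
combined = combine(first, second)  # AddImm(dest='r3', src='r2', imm=12)
```

Under these assumptions, `run_parallel([first, combined], regs)` produces the same register state as `run_sequential([first, second], regs)`, but in one step instead of two. The same rewrite applies to an unrolled induction variable update such as `i = i + 1` followed by `j = i + 1`, which is one way to picture how the inter-iteration dependence mentioned in the abstract can disappear.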