Static speculation as post-link optimization for the Grid Alu processor

Authors:
Ralf Jahr;Basher Shehan;Sascha Uhrig;Theo Ungerer
Affiliations:
Institute of Computer Science, University of Augsburg, Augsburg, Germany;Institute of Computer Science, University of Augsburg, Augsburg, Germany;Institute of Computer Science, University of Augsburg, Augsburg, Germany;Institute of Computer Science, University of Augsburg, Augsburg, Germany
Venue:
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing
Year:
2010

Abstract

In this paper we propose and evaluate a post-link optimization that increases instruction-level parallelism (ILP) by moving instructions from a basic block into its preceding blocks. The Grid Alu Processor used for the evaluation comprises many functional units that the original instruction stream does not fully allocate. The proposed technique uses these unallocated functional units to perform operations speculatively, ahead of time. The algorithm can move instructions into multiple predecessors of a source block. If necessary, it adds compensation code so that the moved instructions write to unused registers, whose values are copied into the original target registers once the speculation is resolved. The evaluation shows a maximum speedup of 2.08 on the Grid Alu Processor over the unoptimized version of the same program, owing to better exploitation of ILP and an improved mapping of loops.
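
The hoisting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Instr` and `Block` types, the `hoist_first` function, and the three-operand instruction format are hypothetical, and the sketch assumes that the caller supplies a register unused anywhere in the program and that only side-effect-free arithmetic instructions are moved. It hoists the first instruction of a block into every predecessor, redirects the result to the unused register, and leaves a copy into the original target register behind as compensation code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Instr:
    op: str                      # e.g. "add", "sub", "mov", "branch"
    dst: Optional[str]           # destination register, None for branches
    srcs: Tuple[str, ...] = ()   # source registers


@dataclass
class Block:
    name: str
    instrs: List[Instr] = field(default_factory=list)
    preds: List["Block"] = field(default_factory=list)


# Assumption: these opcodes must never be executed speculatively in this toy model.
UNSAFE_OPS = ("branch", "load", "store")


def hoist_first(block: Block, free_regs: List[str]) -> bool:
    """Hoist the first instruction of `block` into all of its predecessors.

    The hoisted copy writes to a register taken from `free_regs` (assumed to
    be unused in the whole program), so values that are live on other paths
    are not clobbered. A `mov` in the original block copies the speculative
    result into the original target register.
    """
    if not block.instrs or not block.preds or not free_regs:
        return False
    ins = block.instrs[0]
    if ins.dst is None or ins.op in UNSAFE_OPS:
        return False

    tmp = free_regs.pop()
    for pred in block.preds:
        # Insert before a terminating branch so the hoisted code still executes.
        pos = len(pred.instrs)
        if pred.instrs and pred.instrs[-1].op == "branch":
            pos -= 1
        pred.instrs.insert(pos, Instr(ins.op, tmp, ins.srcs))

    # Compensation code: resolve the speculation by copying into the old target.
    block.instrs[0] = Instr("mov", ins.dst, (tmp,))
    return True
```

Because the instruction is the first one in its block and is placed at the end of each predecessor, its source operands hold the same values in both positions. A real post-link optimizer would additionally need liveness information to find unused registers and a cost model to decide which moves pay off.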