Lookahead is a common technique in high-performance uniprocessor design. In general, however, the hardware lookahead window is too small to exploit instruction-level parallelism at run time, while compaction-based parallelizing compilers suffer from worst-case exponential code explosion at compile time. In this paper, we propose a software lookahead method, which permits inter-basic-block code motion at compile time only within a prespecified number of operations, called the software lookahead window, on any path emanating from the currently processed instruction. With software lookahead, instruction-level parallelism can be exploited over a much larger code region than with the hardware approach, yet the lookahead region is still limited to a constant depth set by a user-specifiable window, and thus code explosion is restricted. The proposed scheme has been implemented in our prototype parallelizing compiler, which generates code for uniprocessors with multiple functional units and multiway conditional branches, such as VLIW machines, and potentially for superscalar processors as well. To study the code explosion problem and instruction-level parallelism in branch-intensive code, we compiled five AIX utilities: sort, fgrep, sed, yacc, and compress. The results demonstrate that software lookahead effectively alleviates the code explosion problem while still extracting a substantial amount of inter-basic-block parallelism.
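The idea of a bounded lookahead region can be illustrated with a minimal sketch. This is not the compiler described above: the function name `lookahead_window`, the `successors` mapping, and the toy control-flow graph `CFG` are all hypothetical, and the sketch only collects the operations that lie within a fixed number of operations on some path from the current one. It performs no dependence analysis and no code motion.

```python
from collections import deque

# Hypothetical toy control-flow graph at operation granularity:
# each operation maps to the operations that may execute right after it.
# "b" is a conditional branch with two successors.
CFG = {
    "a": ["b"],
    "b": ["c", "e"],
    "c": ["d"],
    "d": ["f"],
    "e": ["f"],
    "f": ["g"],
    "g": [],
}

def lookahead_window(successors, start, window):
    """Return {op: distance} for every operation that lies within
    `window` operations of `start` on at least one path."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        op = queue.popleft()
        if depth[op] == window:
            continue  # the window boundary: do not look any further
        for nxt in successors.get(op, ()):
            if nxt not in depth:  # first visit gives the shortest distance
                depth[nxt] = depth[op] + 1
                queue.append(nxt)
    depth.pop(start)  # the current operation is not a candidate itself
    return depth

if __name__ == "__main__":
    for w in (1, 2, 3):
        print(w, sorted(lookahead_window(CFG, "a", w)))
```

Because the traversal stops at a constant depth, the number of candidate operations stays bounded by the window size and the branching factor, regardless of how large the whole program is. That is the property the abstract relies on to restrict code explosion.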