Software pipelining can significantly increase the execution rate of loops. Each of the four major software pipelining algorithms takes a different approach to the problem. This paper discusses each method and explores some of the similarities and differences among them.

On loops consisting of a single basic block, the Perfect Pipelining Algorithm [1] is currently the only software pipelining algorithm that achieves time optimality in the absence of resource constraints. A technique for unrolling the loop before pipelining is presented as an improvement to software pipelining, as it can allow Lam's algorithm [2] to achieve time optimality for these restricted loops. Unrolling also has an advantage over Perfect Pipelining because it can reduce the code space required for the software pipeline.
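For readers unfamiliar with the two transformations named above, the following is a minimal illustrative sketch, not an implementation of Perfect Pipelining, Lam's algorithm, or any other method the paper compares. It assumes a hypothetical three-stage loop body (load, compute, store) and shows by hand what a software-pipelined schedule (prologue, kernel, epilogue) and a loop unrolled by two look like. All function names are invented for this example, and because Python executes sequentially, the sketch shows only the reordering of operations, not any speedup.

```python
def plain_loop(a):
    """Reference loop: each iteration loads, computes, then stores."""
    out = []
    for x in a:
        loaded = x            # stage 1: load
        scaled = loaded * 2   # stage 2: compute
        out.append(scaled)    # stage 3: store
    return out


def pipelined_loop(a):
    """Hand-pipelined loop: the kernel mixes stages of three iterations."""
    n = len(a)
    if n < 2:
        return plain_loop(a)  # too short to fill the pipeline
    out = []
    # Prologue: start iterations 0 and 1 to fill the pipeline.
    loaded = a[0]             # load for iteration 0
    scaled = loaded * 2       # compute for iteration 0
    loaded = a[1]             # load for iteration 1
    # Kernel: on a VLIW machine these three operations could issue together.
    for i in range(2, n):
        out.append(scaled)    # store for iteration i-2
        scaled = loaded * 2   # compute for iteration i-1
        loaded = a[i]         # load for iteration i
    # Epilogue: drain the iterations still in flight.
    out.append(scaled)        # store for iteration n-2
    scaled = loaded * 2       # compute for iteration n-1
    out.append(scaled)        # store for iteration n-1
    return out


def unrolled_by_two(a):
    """Loop unrolled by a factor of two, with a cleanup step for odd lengths."""
    out = []
    n = len(a)
    i = 0
    while i + 1 < n:
        out.append(a[i] * 2)      # body copy for iteration i
        out.append(a[i + 1] * 2)  # body copy for iteration i+1
        i += 2
    if i < n:
        out.append(a[i] * 2)      # leftover iteration
    return out
```

All three functions compute the same result for any input list. The difference lies in the order of operations: the pipelined kernel overlaps stages from neighbouring iterations, while the unrolled loop exposes two independent copies of the body to a scheduler.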