Bulldog: a compiler for VLSI architectures
Bulldog: a compiler for VLSI architectures
Efficient instruction scheduling for a pipelined architecture
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
MIPS RISC architecture
Instruction scheduling for the IBM RISC System/6000 processor
IBM Journal of Research and Development
Scheduling time-critical instructions on RISC machines
POPL '90 Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Lockup-free caches in high-performance multiprocessors
Journal of Parallel and Distributed Computing
High-bandwidth data memory systems for superscalar processors
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
ICS '90 Proceedings of the 4th international conference on Supercomputing
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
MC88100 Microprocessors User's Manual
MC88100 Microprocessors User's Manual
Code generation and reorganization in the presence of pipeline constraints
POPL '82 Proceedings of the 9th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Lockup-free instruction fetch/prefetch cache organization
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
Hi-index | 0.00 |
Traditional list schedulers order instructions based on an optimistic estimate of the load delay imposed by the implementation. Therefore they cannot respond to variations in load latencies (due to cache hits or misses, congestion in the memory interconnect, etc.) and cannot easily be applied across different implementations. We have developed an alternative algorithm, known as balanced scheduling, that schedules instructions based on an estimate of the amount of instruction level parallelism in the program. Since scheduling decisions are program- rather than machine-based, balanced scheduling is unaffected by implementation changes. Since it is based on the amount of instruction level parallelism that a program can support, it can respond better to variations in load latencies. Performance improvements over a traditional list scheduler on a Fortran workload and simulating several different machine types (cache-based workstations, large parallel machines with a multipath interconnect and a combination, all with non-blocking processors) are quite good, averaging between 3% and 18%.