Balanced scheduling: instruction scheduling when memory latency is uncertain

Authors:
Daniel R. Kerns; Susan J. Eggers
Affiliations:
Sand Point Engineering, Mercer Island, WA; University of Washington, Seattle, WA
Venue:
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Year:
2004

Abstract

Traditional list schedulers order instructions based on an optimistic estimate of the load delay imposed by the implementation. As a result, they cannot respond to variations in load latency (caused by cache hits or misses, congestion in the memory interconnect, etc.) and cannot easily be applied across different implementations. We have developed an alternative algorithm, known as balanced scheduling, that schedules instructions based on an estimate of the amount of instruction-level parallelism in the program. Because its scheduling decisions are program-based rather than machine-based, balanced scheduling is unaffected by implementation changes. Because it is driven by the amount of instruction-level parallelism that a program can support, it responds better to variations in load latency. We compared it with a traditional list scheduler on a Fortran workload, simulating several machine types (cache-based workstations, large parallel machines with a multipath interconnect, and a combination of the two, all with non-blocking processors). The performance improvements are quite good, averaging between 3% and 18%.
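
The abstract does not spell out how the parallelism estimate is turned into scheduling decisions. As a rough illustration only, the sketch below assigns each load a weight that grows with the number of instructions independent of it, so that loads with more surrounding parallelism are given more room to hide latency. The function names (`reachable`, `load_weights`), the dependence-graph encoding, and the even-split credit rule are all assumptions made for this example. It is a simplification of the idea, not the authors' algorithm.

```python
from fractions import Fraction


def reachable(graph, start):
    """Return every node reachable from start by following edges in graph."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def load_weights(succs, loads):
    """Estimate a latency weight per load from the parallelism around it.

    succs: dict mapping each instruction to the instructions that depend on it.
    loads: set of instructions that are loads.

    Assumed rule (hypothetical, for illustration): every instruction hands out
    one unit of credit, split evenly among the loads it is independent of.
    """
    nodes = set(succs) | {s for ss in succs.values() for s in ss}

    # Build the reverse edges so ancestors can be found as well as descendants.
    preds = {n: [] for n in nodes}
    for n, ss in succs.items():
        for s in ss:
            preds[s].append(n)

    # Every load occupies at least one issue slot itself.
    weights = {load: Fraction(1) for load in loads}

    for instr in nodes:
        # Instructions ordered before or after instr cannot overlap with it.
        related = reachable(succs, instr) | reachable(preds, instr) | {instr}
        independent = [load for load in loads if load not in related]
        for load in independent:
            weights[load] += Fraction(1, len(independent))
    return weights


# Two loads feed one add; two unrelated instructions can fill the gap.
example = {"L1": ["A1"], "L2": ["A1"], "A1": [], "X1": [], "X2": []}
weights = load_weights(example, {"L1", "L2"})
```

In this small example each load ends up with weight 3: one for its own slot, one from the other load, and half a unit from each of `X1` and `X2`. A list scheduler could then use these weights in place of a fixed, machine-specific load delay. The weights depend only on the dependence graph, which matches the abstract's point that the decisions are program-based rather than machine-based.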