Balanced scheduling: instruction scheduling when memory latency is uncertain

Authors:
Daniel R. Kerns;Susan J. Eggers
Affiliations:
-;-
Venue:
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Year:
1993

Citing 13
Cited 18

Bulldog: a compiler for VLSI architectures

Bulldog: a compiler for VLSI architectures
Efficient instruction scheduling for a pipelined architecture

SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
MIPS RISC architecture

MIPS RISC architecture
Computer architecture: a quantitative approach

Computer architecture: a quantitative approach
Instruction scheduling for the IBM RISC System/6000 processor

IBM Journal of Research and Development
Scheduling time-critical instructions on RISC machines

POPL '90 Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Lockup-free caches in high-performance multiprocessors

Journal of Parallel and Distributed Computing
High-bandwidth data memory systems for superscalar processors

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
The Tera computer system

ICS '90 Proceedings of the 4th international conference on Supercomputing
APRIL: a processor architecture for multiprocessing

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
MC88100 Microprocessors User's Manual

MC88100 Microprocessors User's Manual
Code generation and reorganization in the presence of pipeline constraints

POPL '82 Proceedings of the 9th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Lockup-free instruction fetch/prefetch cache organization

ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture

Improving balanced scheduling with compiler optimizations that increase instruction-level parallelism

PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Instruction scheduling and executable editing

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Predictability of load/store instruction latencies

MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Cache sensitive modulo scheduling

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Cache-conscious data placement

Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Code transformations to improve memory parallelism

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Handling Global Constraints in Compiler Strategy

International Journal of Parallel Programming
Load Scheduling with Profile Information

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Static Scheduling of Instructions on Micronet-based Asynchronous Processors

ASYNC '96 Proceedings of the 2nd International Symposium on Advanced Research in Asynchronous Circuits and Systems
Efficient instruction scheduling for a pipelined architecture

ACM SIGPLAN Notices - Best of PLDI 1979-1999
Static Placement, Dynamic Issue (SPDI) Scheduling for EDGE Architectures

Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Queue Usage and Memory-Level Parallelism Sensitive Scheduling

HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
Instruction scheduling for a tiled dataflow architecture

Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Latency-tolerant software pipelining in a production compiler

Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
Variable latency caches for nanoscale processor

Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Engineering scalable, cache and space efficient tries for strings

The VLDB Journal — The International Journal on Very Large Data Bases
Redesigning the string hash table, burst trie, and BST to exploit cache

Journal of Experimental Algorithmics (JEA)
Feedback-Based global instruction scheduling for GPGPU applications

ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part I

Quantified Score

Hi-index	0.00

Visualization

Abstract

Traditional list schedulers order instructions based on an optimistic estimate of the load delay imposed by the implementation. Therefore they cannot respond to variations in load latencies (due to cache hits or misses, congestion in the memory interconnect, etc.) and cannot easily be applied across different implementations. We have developed an alternative algorithm, known as balanced scheduling, that schedule instructions based on an estimate of the amount of instruction level parallelism in the program. Since scheduling decisions are program-rather than machine-based, balanced scheduling is unaffected by implementation changes. Since it is based on the amount of instruction level parallelism that a program can support, it can respond better to variations in load latencies. Performance improvements over a traditional list scheduler on a Fortran workload and simulating several different machine types (cache-based workstations, large parallel machines with a multipath interconnect and a combination, all with non-blocking processors) are quite good, averaging between 3% and 18%.