Detecting equality of variables in programs
POPL '88 Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
The multiflow trace scheduling compiler
The Journal of Supercomputing - Special issue on instruction-level parallelism
Superblock formation using static program analysis
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communications systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Advanced compiler design and implementation
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Space-time scheduling of instruction-level parallelism on a raw machine
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Compiler Design Handbook: Optimizations and Machine Code Generation
Cluster assignment for high-performance embedded VLIW processors
ACM Transactions on Design Automation of Electronic Systems (TODAES)
DSC: Scheduling Parallel Tasks on an Unbounded Number of Processors
IEEE Transactions on Parallel and Distributed Systems
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Static Placement, Dynamic Issue (SPDI) Scheduling for EDGE Architectures
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Advances in semiconductor fabrication technology will continue to enable an exponential increase in the number of available transistors. However, conventional architectures such as superscalars and VLIWs cannot use the abundant on-chip resources efficiently to achieve high performance because of their inherent lack of scalability. To overcome these deficiencies, architects have developed a new breed of processors whose distributed resources are interconnected via sophisticated networks. Although these processors have the potential to achieve higher performance and better power efficiency than conventional processors, their distributed resources make it difficult to write a compiler that generates a high-quality schedule for an application. In this paper, we propose a scheduling approach targeted at such distributed-resource architectures. Our approach simultaneously places and schedules each operation and routes communication between producer and consumer operations. We introduce techniques that use the scarce interconnect resources efficiently and show empirically that they speed up applications. In addition, we present a simple yet flexible way to describe the target architecture and to generate the scheduling data structures that represent its interconnect structure.
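To illustrate the kind of joint placement-and-scheduling problem the abstract describes, here is a minimal sketch of a greedy space-time scheduler. It walks operations in dependence order and, for each one, jointly picks a tile (placement) and a cycle (schedule), charging one cycle per network hop to route a value from each producer's tile. The mesh topology, Manhattan routing model, and all function names are illustrative assumptions, not the paper's actual algorithm.

```python
# Hedged sketch of joint placement + scheduling on a 2-D mesh of tiles.
# Assumptions (not from the paper): one issue slot per tile per cycle,
# one cycle of routing latency per Manhattan hop, greedy earliest-cycle choice.

def manhattan(a, b):
    """Hop count between two tiles on a 2-D mesh interconnect."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def schedule(ops, deps, mesh_w, mesh_h):
    """ops: operation ids in topological (dependence) order.
    deps: dict mapping an op to the list of its producer ops.
    Returns a dict mapping each op to its chosen (tile, cycle)."""
    placement = {}
    busy = set()  # (tile, cycle) pairs already holding an operation
    for op in ops:
        best = None
        for x in range(mesh_w):
            for y in range(mesh_h):
                tile = (x, y)
                # Earliest cycle at which all operands can arrive here,
                # accounting for routing hops from each producer's tile.
                ready = 0
                for p in deps.get(op, []):
                    p_tile, p_cycle = placement[p]
                    ready = max(ready, p_cycle + 1 + manhattan(p_tile, tile))
                cycle = ready
                while (tile, cycle) in busy:  # find a free issue slot
                    cycle += 1
                if best is None or cycle < best[1]:
                    best = (tile, cycle)
        placement[op] = best
        busy.add(best)
    return placement
```

For example, scheduling two independent producers feeding one consumer on a 2x2 mesh places the producers on different tiles in cycle 0; the consumer then issues after one cycle of compute plus the routing hops needed to collect both operands, showing how interconnect latency feeds back into placement decisions.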