Parallel processing: a smart compiler and a dumb machine

Authors:
Joseph A. Fisher; John R. Ellis; John C. Ruttenberg; Alexandru Nicolau
Affiliations:
Yale University, New Haven, CT; Yale University, New Haven, CT; Yale University, New Haven, CT; Yale University, New Haven, CT
Venue:
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
Year:
1984

Abstract

Multiprocessors and vector machines, the only successful parallel architectures, have coarse-grained parallelism that is hard for compilers to take advantage of. We've developed a new fine-grained parallel architecture and a compiler that together offer order-of-magnitude speedups for ordinary scientific code.