A Scheme to Enforce Data Dependence on Large Multiprocessor Systems. IEEE Transactions on Software Engineering.
High-performance computer architecture
Compiler algorithms for synchronization. IEEE Transactions on Computers.
Experience Using Multiprocessor Systems—A Status Report. ACM Computing Surveys (CSUR).
Dependence graphs and compiler optimizations. POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages.
Multiprocessor Synchronization for Concurrent Loops. IEEE Software.
Compile-time scheduling and optimization for asynchronous machines (multiprocessor, compiler, parallel processing)
Run-time parallelization and scheduling of loops. SPAA '89 Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures.
Run-Time Parallelization and Scheduling of Loops. IEEE Transactions on Computers.
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Run-time methods for parallelizing partially parallel loops. ICS '95 Proceedings of the 9th International Conference on Supercomputing.
Compiler techniques for data synchronization in nested parallel loops. ICS '90 Proceedings of the 4th International Conference on Supercomputing.
IEEE Transactions on Parallel and Distributed Systems
ICS '89 Proceedings of the 3rd international conference on Supercomputing
Techniques for speculative run-time parallelization of loops. SC '98 Proceedings of the 1998 ACM/IEEE Conference on Supercomputing.
An efficient algorithm for the run-time parallelization of DOACROSS loops. Proceedings of the 1994 ACM/IEEE Conference on Supercomputing.
The Illinois Aggressive Coma Multiprocessor project (I-ACOMA). FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation.
This paper proposes an approach to minimally constrained synchronization for the parallel execution of imperative programs in a shared-memory environment. Anti-dependences and output dependences arising from array references within loops are removed entirely, using run-time analysis where necessary. A parallel reference-pattern generation scheme, based on one proposed in [13], is used in conjunction with dynamic allocation and binding of storage to remove all non-intrinsic data dependences during execution.
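The core idea of removing anti- and output dependences through fresh storage can be illustrated with a minimal sketch (not taken from the paper; the loop and function name are hypothetical). An in-place update such as `a[i] = a[i+1] + 1` carries an anti-dependence: iteration i reads `a[i+1]` before iteration i+1 overwrites it, so the iterations must run in order. Renaming the written locations into newly allocated storage removes that ordering constraint, leaving the iterations free to execute in parallel:

```python
# Sketch: removing an anti-dependence by renaming (dynamic storage allocation).
# In-place, iteration i of  a[i] = a[i+1] + 1  reads a location that
# iteration i+1 later writes, forcing sequential execution.  Directing
# all writes into fresh storage makes every iteration independent.

def renamed_update(a):
    n = len(a)
    a_new = a[:]              # fresh storage: writes no longer conflict with reads of a
    for i in range(n - 1):    # iterations are now independent; any order is valid
        a_new[i] = a[i + 1] + 1
    return a_new

print(renamed_update([0, 1, 2, 3]))  # [2, 3, 4, 3]
```

Output dependences (two iterations writing the same location) are removed the same way: each write targets its own renamed copy, and only the last binding is kept.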