Translating program loops into a parallel form is one of the most important transformations performed by concurrentizing compilers. This transformation often requires the insertion of synchronization instructions within the body of the concurrent loop. Several loop synchronization techniques are presented first. Compiler algorithms to generate synchronization instructions for singly-nested loops are then discussed. Finally, a technique for the elimination of redundant synchronization instructions is presented.
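As a rough illustration of why a concurrent loop needs synchronization inserted in its body, the sketch below runs each iteration of a small loop in its own thread. Statement S2 in iteration i reads the value that statement S1 wrote in iteration i-1, so a "post" is placed after S1 and a "wait" before S2. The loop body, the function names `doacross` and `sequential`, and the use of Python's `threading.Event` as a post/wait primitive are all assumptions made for this example; they are not taken from the paper and do not reproduce its algorithms.

```python
import threading


def sequential(b):
    """Reference version: the original serial loop."""
    n = len(b)
    a = [0] * n
    c = [0] * n
    for i in range(n):
        a[i] = 2 * b[i]          # S1
        if i > 0:
            c[i] = a[i - 1] + 1  # S2 reads what S1 wrote one iteration earlier
    return a, c


def doacross(b):
    """Concurrent version: one thread per iteration, with post/wait
    synchronization enforcing the S1 -> S2 dependence of distance 1.
    (Hypothetical sketch; threading.Event stands in for a hardware or
    runtime synchronization instruction.)"""
    n = len(b)
    a = [0] * n
    c = [0] * n
    # written[i] is set once iteration i has finished S1, i.e. a[i] is valid.
    written = [threading.Event() for _ in range(n)]

    def iteration(i):
        a[i] = 2 * b[i]            # S1: source of the dependence
        written[i].set()           # post: a[i] is now available
        if i > 0:
            written[i - 1].wait()  # wait: block until a[i-1] has been written
            c[i] = a[i - 1] + 1    # S2: sink of the dependence

    # Start the threads in reverse order so that, without the wait,
    # a later iteration could easily run ahead of the one it depends on.
    threads = [threading.Thread(target=iteration, args=(i,))
               for i in reversed(range(n))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a, c


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(doacross(data) == sequential(data))
```

Each iteration posts its own event before it waits on its predecessor's, so no iteration can block another from posting and the loop cannot deadlock. If a second statement in the same iteration also read `a[i - 1]` after S2, a second wait on `written[i - 1]` would be redundant because the first one already guarantees the value is present, which is the kind of synchronization the abstract's elimination technique targets.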