Guided self-scheduling: A practical scheduling scheme for parallel supercomputers
IEEE Transactions on Computers
Detecting conflicts between structure accesses
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Commutativity-Based Concurrency Control for Abstract Data Types
IEEE Transactions on Computers
Lazy task creation: a technique for increasing the granularity of parallel programs
LFP '90 Proceedings of the 1990 ACM conference on LISP and functional programming
Analysis of pointers and structures
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Making asynchronous parallelism safe for the world
POPL '90 Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Program optimization and parallelization using idioms
POPL '91 Proceedings of the 18th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
Compiling Fortran D for MIMD distributed-memory machines
Communications of the ACM
The design and analysis of DASH: a scalable directory-based multiprocessor
The design and analysis of DASH: a scalable directory-based multiprocessor
Eliminating false data dependences using the Omega test
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Parallel hierarchical N-body methods
Parallel hierarchical N-body methods
Parallel hierarchical N-body methods and their implications for multiprocessors
Parallel hierarchical N-body methods and their implications for multiprocessors
Parallelizing complex scans and reductions
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Context-sensitive interprocedural points-to analysis in the presence of function pointers
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Efficient context-sensitive pointer analysis for C programs
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Software caching and computation migration in Olden
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Flattening and parallelizing irregular, recurrent loop nests
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Compiler optimizations for eliminating barrier synchronization
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
The design, implementation and evaluation of Jade: a portable, implicitly parallel programming language
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Detecting coarse-grain parallelism using an interprocedural parallelizing compiler
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Communication optimizations for parallel computing using data access information
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Experience with processes and monitors in Mesa
Communications of the ACM
Dependence Analysis for Supercomputing
Dependence Analysis for Supercomputing
Performance Analysis of Parallelizing Compilers on the Perfect Benchmarks Programs
IEEE Transactions on Parallel and Distributed Systems
IPPS '95 Proceedings of the 9th International Symposium on Parallel Processing
Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Recognizing and Parallelizing Bounded Recurrences
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Analysis of Dynamic Structures for Efficient Parallel Execution
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
On the Complexity of Commutativity Analysis
COCOON '96 Proceedings of the Second Annual International Conference on Computing and Combinatorics
Commutativity Analysis: A Technique for Automatically Parallelizing Pointer-Based Computations
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Dynamic feedback: an effective technique for adaptive computing
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Synchronization transformations for parallel computing
Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Detecting data races in Cilk programs that use locks
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
ICS '98 Proceedings of the 12th international conference on Supercomputing
Locality Analysis for Parallel C Programs
IEEE Transactions on Parallel and Distributed Systems
Automatic parallelization of divide and conquer algorithms
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Modular interprocedural pointer analysis using access paths: design, implementation, and evaluation
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Matching and searching analysis for parallel hardware implementation on FPGAs
FPGA '01 Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays
Using types to analyze and optimize object-oriented programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
High-level Language Support for User-defined Reductions
The Journal of Supercomputing
Automatic Parallelization of Recursive Procedures
International Journal of Parallel Programming
Parallelizing graph construction operations in programs with cyclic graphs
Parallel Computing
Identifying parallelism in programs with cyclic graphs
Journal of Parallel and Distributed Computing
Proving optimizations correct using parameterized program equivalence
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Identifying static analysis techniques for finding non-fix hunks in fix revisions
Proceedings of the ACM first international workshop on Data-intensive software management and mining
Commutative set: a language extension for implicit parallel programming
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
NDSeq: runtime checking for nondeterministic sequential specifications of parallel correctness
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
HAWKEYE: effective discovery of dataflow impediments to parallelization
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
A universal calculus for stream processing languages
ESOP'10 Proceedings of the 19th European conference on Programming Languages and Systems
Effective straggler mitigation: attack of the clones
nsdi'13 Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation
A catalog of stream processing optimizations
ACM Computing Surveys (CSUR)
GRASS: trimming stragglers in approximation analytics
NSDI'14 Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation
Hi-index | 0.00 |
This paper presents a new analysis technique, commutativity analysis, for automatically parallelizing computations that manipulate dynamic, pointer-based data structures. Commutativity analysis views the computation as composed of operations on objects. It then analyzes the program at this granularity to discover when operations commute (i.e. generate the same final result regardless of the order in which they execute). If all of the operations required to perform a given computation commute, the compiler can automatically generate parallel code.We have implemented a prototype compilation system that uses commutativity analysis as its primary analysis framework. We have used this system to automatically parallelize two complete scientific computations: the Barnes-Hut N-body solver and the Water code. This paper presents performance results for the generated parallel code running on the Stanford DASH machine. These results provide encouraging evidence that commutativity analysis can serve as the basis for a successful parallelizing compiler.