Detecting conflicts between structure accesses
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Guaranteed-quality mesh generation for curved surfaces
SCG '93 Proceedings of the ninth annual symposium on Computational geometry
Runtime compilation techniques for data partitioning and communication schedule reuse
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Putting pointer analysis to work
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Parametric shape analysis via 3-valued logic
Proceedings of the 26th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
IEEE Transactions on Parallel and Distributed Systems
A Chip-Multiprocessor Architecture with Speculative Multithreading
IEEE Transactions on Computers
Optimizing compilers for modern architectures: a dependence-based approach
Optimizing compilers for modern architectures: a dependence-based approach
Parallelizing Programs with Recursive Data Structures
IEEE Transactions on Parallel and Distributed Systems
Transactional Memory (Synthesis Lectures on Computer Architecture)
Transactional Memory (Synthesis Lectures on Computer Architecture)
Open nesting in software transactional memory
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Optimistic parallelism requires abstractions
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Algorithm 872: Parallel 2D constrained Delaunay mesh generation
ACM Transactions on Mathematical Software (TOMS)
Optimistic parallelism benefits from data partitioning
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Three-dimensional delaunay refinement for multi-core processors
Proceedings of the 22nd annual international conference on Supercomputing
Scheduling strategies for optimistic parallel execution of irregular programs
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
Isolation for nested task parallelism
Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications
Hi-index | 0.00 |
Irregular applications, i.e. , programs that manipulate pointer-based data structures such as graphs and trees, constitute a challenging target for parallelization because the amount of parallelism is input dependent and changes dynamically. Traditional dependence analysis techniques are too conservative to expose this parallelism. Even manual parallelization is difficult, time consuming, and error prone. The Galois system parallelizes such applications using an optimistic approach that exploits higher-level semantics of abstract data types. In this paper, we study the performance and scalability of a Galoised, that is, automatically parallelized, version of Delaunay mesh refinement (DR) on a shared-memory system with 128 CPUs. DR is an important irregular application that is used, e.g. , in graphics and finite-element codes. The parallelized program scales to 64 threads, where it reaches a speedup of 25.8. For large numbers of threads, the performance is hampered by the load imbalance and the nonuniform memory latency, both of which grow as the number of threads increases. While these two issues will have to be addressed in future work, we believe our results already show the Galois approach to be very promising.