A multigrain Delaunay mesh generation method for multicore SMT-based architectures

Authors:
Christos D. Antonopoulos; Filip Blagojevic; Andrey N. Chernikov; Nikos P. Chrisochoides; Dimitrios S. Nikolopoulos
Affiliations:
Department of Computer and Communications Engineering, University of Thessaly, Volos, Greece; Lawrence Berkeley National Lab, Berkeley, CA 94720, United States; Department of Computer Science, The College of William and Mary, Williamsburg, VA 23187, United States; Department of Computer Science, The College of William and Mary, Williamsburg, VA 23187, United States; Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States
Venue:
Journal of Parallel and Distributed Computing
Year:
2009

Abstract

Given the proliferation of layered, multicore- and SMT-based architectures, it is imperative to deploy and evaluate important multi-level scientific computing codes, such as meshing algorithms, on these systems. We focus on Parallel Constrained Delaunay Mesh (PCDM) generation. We exploit coarse-grain parallelism at the subdomain level, medium-grain parallelism at the cavity level, and fine-grain parallelism at the element level. This multigrain data-parallel approach targets clusters built from commercially available SMT and multicore processors. Exploiting coarse-grain parallelism facilitates scalability in both execution time and problem size on loosely coupled clusters. Exploiting medium-grain parallelism improves performance at the single-node level. Our experimental evaluation shows that the first generation of SMT cores cannot take advantage of fine-grain parallelism in PCDM. Many of our experimental findings with PCDM extend to other adaptive and irregular multigrain parallel algorithms as well.
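
The three granularity levels named in the abstract can be pictured with a small toy sketch. This is not the PCDM implementation: the function names (`in_circumcircle`, `cavity`, `refine_subdomain`, `refine_mesh`) are hypothetical, the cavity is found by a brute-force scan rather than by walking mesh adjacency, and Python threads are used only to show the nesting structure, not to obtain speedup. Triangles are assumed to be given in counterclockwise order.

```python
from concurrent.futures import ThreadPoolExecutor


def in_circumcircle(tri, p):
    """True if p lies strictly inside the circumcircle of a CCW triangle."""
    (ax, ay), (bx, by), (cx, cy) = tri
    px, py = p
    a, b = ax - px, ay - py
    c, d = bx - px, by - py
    e, f = cx - px, cy - py
    # Standard in-circle determinant, expanded along the squared-length column.
    det = ((a * a + b * b) * (c * f - d * e)
           - (c * c + d * d) * (a * f - b * e)
           + (e * e + f * f) * (a * d - b * c))
    return det > 0


def cavity(triangles, p):
    # Fine grain (element level): one independent in-circle test per element.
    # Kept sequential here; each test is the unit a fine-grain scheme would split.
    return [i for i, t in enumerate(triangles) if in_circumcircle(t, p)]


def refine_subdomain(subdomain):
    # Medium grain (cavity level): one task per candidate point.
    # A real mesher must also ensure concurrent cavities do not overlap;
    # this toy only computes them and never modifies the mesh.
    triangles, points = subdomain
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(lambda p: cavity(triangles, p), points))


def refine_mesh(subdomains):
    # Coarse grain (subdomain level): one task per subdomain.
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(refine_subdomain, subdomains))


if __name__ == "__main__":
    # Two unit squares, each split into two CCW triangles (assumed test data).
    sq0 = [((0, 0), (1, 0), (1, 1)), ((0, 0), (1, 1), (0, 1))]
    sq1 = [((2, 0), (3, 0), (3, 1)), ((2, 0), (3, 1), (2, 1))]
    subdomains = [(sq0, [(0.9, 0.1), (2.0, 2.0)]), (sq1, [(2.5, 0.2)])]
    print(refine_mesh(subdomains))
```

Each level is nested inside the one above it: subdomains are independent tasks, cavities within a subdomain are independent tasks, and the per-element tests inside `cavity` are the finest unit of work.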