A new and simple method is proposed for choosing the data configuration with the shortest processing-phase time, based on prior probing of edge-based matrix-vector products, for codes that use iterative solvers on unstructured grid problems. The method is implemented as a suite of routines named EdgePack, which acts during both the pre-solution and solution phases and relies on data locality optimization techniques and variations of the matrix-vector product algorithm. Results demonstrate the flexibility and simplicity of the method, which is suitable for distributed memory platforms in which different data configurations can coexist.
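
The probing idea can be illustrated with a minimal Python sketch. It is not EdgePack itself: the function names (`edge_matvec`, `sort_edges`, `probe`), the storage layout (a diagonal plus one off-diagonal coefficient per edge of a symmetric matrix), and the two candidate configurations ("original" and "sorted") are assumptions made only for illustration. The sketch times an edge-based matrix-vector product under each configuration and selects the fastest one.

```python
import time


def edge_matvec(diag, edges, coeffs, x):
    """Compute y = A x for a symmetric matrix stored edge by edge.

    diag[i] is A[i][i]; each edge (i, j) with coefficient a contributes
    A[i][j] = A[j][i] = a. This layout is an assumption of the sketch.
    """
    y = [d * xi for d, xi in zip(diag, x)]
    for (i, j), a in zip(edges, coeffs):
        y[i] += a * x[j]
        y[j] += a * x[i]
    return y


def sort_edges(edges, coeffs):
    """Reorder edges by (smaller node, larger node), one simple locality heuristic."""
    order = sorted(range(len(edges)),
                   key=lambda k: (min(edges[k]), max(edges[k])))
    return [edges[k] for k in order], [coeffs[k] for k in order]


def probe(configs, diag, x, repeats=5):
    """Time the product for every configuration and return the fastest name."""
    timings = {}
    for name, (edges, coeffs) in configs.items():
        start = time.perf_counter()
        for _ in range(repeats):
            edge_matvec(diag, edges, coeffs, x)
        timings[name] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    return best, timings


# Hypothetical 4-node example: a ring graph with a diagonally dominant matrix.
diag = [4.0, 4.0, 4.0, 4.0]
edges = [(2, 3), (0, 1), (1, 2), (0, 3)]
coeffs = [-1.0, -1.0, -1.0, -1.0]
x = [1.0, 2.0, 3.0, 4.0]

configs = {
    "original": (edges, coeffs),
    "sorted": sort_edges(edges, coeffs),
}
best, timings = probe(configs, diag, x)
```

On a distributed memory platform, each process could run such a probe on its own partition before the solution phase and keep its own winner, which is how different data configurations could coexist. On a grid this small the timings are dominated by noise, so the selected name is only meaningful for realistic problem sizes.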