Parallel, multigrain iterative solvers for hiding network latencies on MPPs and networks of clusters
Parallel Computing - Parallel matrix algorithms and applications (PMAA '02)
Clusters of workstations have become a cost-effective means of performing scientific computations. However, the large network latencies, resource sharing, and heterogeneity found in networks of clusters and in Grids can impede the performance of applications not specifically tailored to such environments. A typical example is the traditional fine-grain implementation of Krylov-like iterative methods, a central component of many scientific applications. To exploit the potential of these environments, advances in networking technology must be complemented by advances in parallel algorithmic design. In this paper, we present an algorithmic technique that increases the granularity of parallel, block iterative methods by inducing additional work during the preconditioning (inexact solution) phase of the iteration. During this phase, each vector in the block is preconditioned by a different subgroup of processors, yielding a much coarser granularity. The rest of the method constitutes a small portion of the total time and is still implemented in fine grain. We call this combination of fine- and coarse-grain parallelism multigrain. We apply this idea to the block Jacobi-Davidson eigensolver, and present experimental data showing a significant reduction of latency effects on networks of clusters of roughly equal capacity and size. We conclude with a discussion of how multigrain can be applied dynamically, based on runtime network performance monitoring.
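The coarse-grain phase described above can be sketched as follows. This is a hypothetical illustration of the subgroup assignment only, not the authors' implementation: it assumes P processors and a block of k vectors, splits the processor ranks into k subgroups of nearly equal size, and lets each subgroup precondition its own vector so that no communication crosses subgroup (e.g., cluster) boundaries during that phase. The names `multigrain_subgroups`, `precondition_block`, and `apply_prec` are invented for this sketch.

```python
def multigrain_subgroups(num_procs, block_size):
    """Partition processor ranks 0..num_procs-1 into block_size
    subgroups of (nearly) equal size, one subgroup per block vector.

    Illustrative sketch; in an MPI code this mapping would drive a
    communicator split so each subgroup communicates only internally.
    """
    base, extra = divmod(num_procs, block_size)
    groups, start = [], 0
    for v in range(block_size):
        size = base + (1 if v < extra else 0)  # spread the remainder
        groups.append(list(range(start, start + size)))
        start += size
    return groups

def precondition_block(block, groups, apply_prec):
    """Coarse-grain preconditioning phase: subgroup g works only on
    vector g, so only intra-subgroup (low) latencies are paid."""
    return [apply_prec(vec, grp) for vec, grp in zip(block, groups)]
```

For example, `multigrain_subgroups(8, 3)` maps 8 processors onto a block of 3 vectors as `[[0, 1, 2], [3, 4, 5], [6, 7]]`; the remaining fine-grain phases of the method would still use all 8 processors together.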