We are interested in the efficient solution of linear second-order Partial Differential Equation (PDE) problems on rectangular domains. The PDE discretisation scheme used is of Finite Element type and is based on quadratic splines and the collocation methodology. We integrate the Quadratic Spline Collocation (QSC) discretisation scheme with a Domain Decomposition (DD) technique. We develop DD-motivated orderings of the QSC equations and unknowns and apply the Preconditioned Conjugate Gradient (PCG) method to solve the Schur Complement (SC) system. Our experiments show that the SC-PCG-QSC method in its sequential mode is very efficient compared to standard direct band solvers for the QSC equations. We have implemented the SC-PCG-QSC method on the iPSC/2 hypercube and present performance evaluation results for configurations of up to 32 processors. We discuss a type of nearest-neighbour communication scheme in which the amount of data transferred per processor does not grow with the number of processors. The estimated efficiencies are on the order of 90%.
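As a rough illustration of the PCG iteration mentioned above, the following is a minimal sketch in plain Python. It is not the authors' SC-PCG-QSC implementation: the operator `tridiag_matvec` is a hypothetical symmetric positive definite stand-in for the Schur complement, the Jacobi preconditioner `jacobi` is an assumption chosen only for simplicity, and all function names are illustrative.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def pcg(matvec, b, precond, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for an SPD operator.

    matvec(v) applies the operator, precond(r) applies the
    preconditioner inverse. Returns (solution, iterations used).
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for x = 0
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    if rz == 0.0:                    # zero right-hand side
        return x, 0
    b_norm = dot(b, b) ** 0.5
    for k in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def tridiag_matvec(v):
    """Hypothetical stand-in operator: tridiagonal matrix with stencil (-1, 2, -1)."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


def jacobi(r):
    """Jacobi preconditioner (assumed): divide by the constant diagonal 2."""
    return [ri / 2.0 for ri in r]


b = [1.0] * 8
x, iters = pcg(tridiag_matvec, b, jacobi)
residual = max(abs(ai - bi) for ai, bi in zip(tridiag_matvec(x), b))
print(iters, residual)
```

In a domain-decomposition setting, `matvec` would typically apply the Schur complement without forming it explicitly, using subdomain solves that can run independently on each processor; that part is omitted here.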