The memory usage of sparse direct solvers can be the bottleneck when solving large sparse systems of linear equations of the form Ax = b. To solve large problems, we have designed a robust out-of-core solver in which the computed factors are stored on disk. Using large real-life problems (up to several million equations and several hundred million nonzeros), we show that core memory usage can be reduced significantly in parallel (on up to 128 processors), with run times comparable to those of a parallel in-core solver. A careful study shows how the low-level I/O mechanisms affect performance. We describe a low-level I/O layer that avoids the perturbations introduced by system buffers and gives consistently good performance. To reduce memory significantly further, the intermediate working memory can also be stored on disk. In this paper we describe algorithmic models that address this issue and study their potential in terms of both memory requirements and I/O volume. The out-of-core solver discussed in this paper is publicly available and is already used by several academic and industrial groups. The results of the algorithmic modelling will serve as the basis for designing a new version of this solver; this work may also be a useful reference for other developers of sparse out-of-core solvers.
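To make the trade-off between core memory and I/O volume concrete, the following is a minimal toy sketch, not the solver's actual model. It assumes a textbook sequential multifrontal traversal of an assembly tree in postorder, where each node has a frontal matrix and leaves a contribution block on a stack for its parent. The names `Node` and `simulate`, and all sizes in the example, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a hypothetical assembly tree (sizes in matrix entries)."""
    front: int                      # size of the frontal matrix
    cb: int                         # size of the contribution block sent to the parent
    children: list = field(default_factory=list)

    @property
    def factor(self):
        # Part of the front that becomes factor entries.
        return self.front - self.cb


def simulate(root, factors_on_disk):
    """Return (peak core memory, volume written to disk) for a postorder traversal.

    Assumption: the parent's front is allocated while the children's
    contribution blocks are still on the stack, then those blocks are freed.
    """
    state = {"stack": 0, "factors": 0, "peak": 0, "written": 0}

    def visit(node):
        for child in node.children:
            visit(child)            # leaves the child's contribution block on the stack
        # Peak candidate: stacked blocks + in-core factors + the new front.
        in_use = state["stack"] + state["factors"] + node.front
        state["peak"] = max(state["peak"], in_use)
        # Children's contribution blocks are assembled into the front and freed.
        state["stack"] -= sum(c.cb for c in node.children)
        if factors_on_disk:
            state["written"] += node.factor    # factor entries go to disk
        else:
            state["factors"] += node.factor    # factor entries stay in core
        state["stack"] += node.cb              # this node's block waits for its parent

    visit(root)
    return state["peak"], state["written"]


# Tiny example tree: two leaves under one root.
leaf_a = Node(front=16, cb=4)
leaf_b = Node(front=25, cb=9)
root = Node(front=36, cb=0, children=[leaf_a, leaf_b])

in_core = simulate(root, factors_on_disk=False)
out_of_core = simulate(root, factors_on_disk=True)
print(in_core)      # (77, 0)
print(out_of_core)  # (49, 64)
```

In this toy tree, writing the factors to disk lowers the peak from 77 to 49 entries at the cost of writing 64 entries. What remains in core is the stack of contribution blocks plus the current front, which corresponds to the intermediate working memory that the abstract proposes to move to disk as a next step.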