We present the main algorithmic features in the software package SuperLU_DIST, a distributed-memory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with a focus on scalability issues, and demonstrate the software's parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication patterns, which lets us exploit techniques used in parallel sparse Cholesky algorithms to better parallelize both LU decomposition and triangular solution on large-scale distributed machines.
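To make the contrast with partial pivoting concrete, the following is a minimal dense NumPy sketch of the static-pivoting idea, not SuperLU_DIST's actual code. It assumes the pivot order is fixed before elimination starts, so no rows are swapped during factorization and the structure of the factors does not depend on the numerical values. The function names, the tiny-pivot threshold of sqrt(machine epsilon) times the 1-norm of the matrix, and the fixed number of refinement steps are illustrative assumptions. The sketch also omits sparsity, distribution across processes, and any preprocessing such as scaling or permuting large entries to the diagonal.

```python
import numpy as np


def static_pivot_lu(A):
    """Factor A = L @ U with no row interchanges during elimination.

    Pivots that are too small are replaced by a small threshold value
    (an assumption of this sketch) instead of triggering a row swap,
    so the elimination order is known before factorization starts.
    """
    LU = np.array(A, dtype=float)
    n = LU.shape[0]
    # Hypothetical threshold: sqrt(machine epsilon) * ||A||_1.
    threshold = np.sqrt(np.finfo(float).eps) * np.linalg.norm(LU, 1)
    replaced = 0
    for k in range(n):
        if abs(LU[k, k]) < threshold:
            # Perturb the pivot rather than pivoting dynamically.
            LU[k, k] = threshold if LU[k, k] >= 0 else -threshold
            replaced += 1
        LU[k + 1:, k] /= LU[k, k]  # multipliers (column k of L)
        LU[k + 1:, k + 1:] -= np.outer(LU[k + 1:, k], LU[k, k + 1:])
    L = np.tril(LU, -1) + np.eye(n)
    U = np.triu(LU)
    return L, U, replaced


def solve_with_refinement(A, b, L, U, steps=3):
    """Solve A x = b using the factors, then apply iterative refinement.

    Refinement corrects the error introduced by any perturbed pivots.
    np.linalg.solve is a general solver used here for brevity; a real
    code would use dedicated triangular solves.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.linalg.solve(U, np.linalg.solve(L, b))
    for _ in range(steps):
        r = b - A @ x  # residual of the current iterate
        x = x + np.linalg.solve(U, np.linalg.solve(L, r))
    return x


if __name__ == "__main__":
    A = np.array([[0.0, 1.0], [1.0, 1.0]])  # zero leading pivot
    b = np.array([1.0, 2.0])
    L, U, replaced = static_pivot_lu(A)
    x = solve_with_refinement(A, b, L, U)
    print("pivots replaced:", replaced)
    print("residual norm:", np.linalg.norm(b - A @ x))
```

Because the loop never permutes rows, a sparse implementation could in principle compute the nonzero layout of `L` and `U`, and hence the communication pattern, from the matrix structure alone. That is the property the abstract credits for making Cholesky-style parallelization techniques applicable.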