Solving unsymmetric sparse systems of linear equations with PARDISO

Authors:
Olaf Schenk;Klaus Gärtner
Affiliations:
Department of Computer Science, University of Basel, Klingelbergstrasse 50, CH-4056 Basel, Switzerland and IBM Research Division, T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY;Weierstrass Institute for Applied Analysis and Stochastics, Mohrenstr. 39, D-10117 Berlin, Germany
Venue:
Future Generation Computer Systems - Special issue: Selected numerical algorithms
Year:
2004

Citing 16
Cited 71

CGS, a fast Lanczos-type solver for nonsymmetric linear systems

SIAM Journal on Scientific and Statistical Computing
The influence of relaxed supernode partitions on the multifrontal method

ACM Transactions on Mathematical Software (TOMS)
Block sparse Cholesky algorithms on advanced uniprocessor computers

SIAM Journal on Scientific Computing
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms

IBM Journal of Research and Development
Modification of the minimum-degree algorithm by multiple elimination

ACM Transactions on Mathematical Software (TOMS)
An Approximate Minimum Degree Ordering Algorithm

SIAM Journal on Matrix Analysis and Applications
Fast and effective algorithms for graph partitioning and sparse-matrix ordering

IBM Journal of Research and Development - Special issue: optical lithography I
An Unsymmetric-Pattern Multifrontal Method for Sparse LU Factorization

SIAM Journal on Matrix Analysis and Applications
A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs

SIAM Journal on Scientific Computing
A Supernodal Approach to Sparse Partial Pivoting

SIAM Journal on Matrix Analysis and Applications
The Design and Use of Algorithms for Permuting Large Entries to the Diagonal of Sparse Matrices

SIAM Journal on Matrix Analysis and Applications
An Asynchronous Parallel Supernodal Algorithm for Sparse Gaussian Elimination

SIAM Journal on Matrix Analysis and Applications
Analysis and comparison of two general sparse solvers for distributed memory computers

ACM Transactions on Mathematical Software (TOMS)
Recent advances in direct methods for solving unsymmetric sparse systems of linear equations

ACM Transactions on Mathematical Software (TOMS)
A Fully Asynchronous Multifrontal Solver Using Distributed Dynamic Scheduling

SIAM Journal on Matrix Analysis and Applications
Two-level dynamic scheduling in PARDISO: improved scalability on shared memory multiprocessing systems

Parallel Computing - Parallel matrix algorithms and applications

Parallel sparse LU factorization on second-class message passing platforms

Proceedings of the 19th annual international conference on Supercomputing
Successive pad assignment algorithm to optimize number and location of power supply pad using incremental matrix inversion

Proceedings of the 2005 Asia and South Pacific Design Automation Conference
Discrete quadratic curvature energies

ACM SIGGRAPH 2006 Courses
Parallel sparse LU factorization on different message passing platforms

Journal of Parallel and Distributed Computing
A numerical evaluation of sparse direct solvers for the solution of large sparse symmetric linear systems of equations

ACM Transactions on Mathematical Software (TOMS)
Experiences of sparse direct symmetric solvers

ACM Transactions on Mathematical Software (TOMS)
Discrete quadratic curvature energies

Computer Aided Geometric Design
PyTrilinos: High-performance distributed-memory solvers for Python

ACM Transactions on Mathematical Software (TOMS)
On the design of interfaces to sparse direct solvers

ACM Transactions on Mathematical Software (TOMS)
On the solution of the checkerboard problem in mixed-FEM topology optimization

Computers and Structures
Algorithmic performance studies on graphics processing units

Journal of Parallel and Distributed Computing
A multi-level parallel simulation approach to electron transport in nano-scale transistors

Proceedings of the 2008 ACM/IEEE conference on Supercomputing
A Parallel Sparse Linear Solver for Nearest-Neighbor Tight-Binding Problems

Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
A numerical evaluation of preprocessing and ILU-type preconditioners for the solution of unsymmetric sparse linear systems using iterative methods

ACM Transactions on Mathematical Software (TOMS)
On the elliptic mesh generation in domains containing multiple inclusions and undergoing large deformations

Journal of Computational Physics
Perspective on the geometric conservation law and finite element methods for ALE simulations of incompressible flow

Journal of Computational Physics
Optimizing content-preserving projections for wide-angle images
A Mixed DG Method for Linearized Incompressible Magnetohydrodynamics

Journal of Scientific Computing
Face/Off: live facial puppetry

Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer Animation
Fast Implicit Simulation of Oscillatory Flow in Human Abdominal Bifurcation Using a Schur Complement Preconditioner

Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
PSPIKE: A Parallel Hybrid Sparse Linear System Solver

Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
A parallel preconditioning strategy for efficient transistor-level circuit simulation

Proceedings of the 2009 International Conference on Computer-Aided Design
Global correspondence optimization for non-rigid registration of depth scans

SGP '08 Proceedings of the Symposium on Geometry Processing
Fast experimental estimation of drag coefficients of arbitrary structures

IROS'09 Proceedings of the 2009 IEEE/RSJ international conference on Intelligent robots and systems
Numerical strategies towards peta-scale simulations of nanoelectronics devices

Parallel Computing
AMESOS: a set of general interfaces to sparse direct solver libraries

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Image warps for artistic perspective manipulation

ACM SIGGRAPH 2010 papers
Fill-ins number reducing direct solver designed for FIT-type matrix

Mathematics and Computers in Simulation
A tearing-based hybrid parallel sparse linear system solver

Journal of Computational and Applied Mathematics
Efficient algebraic multigrid for migration-diffusion-convection-reaction systems arising in electrochemical simulations

Journal of Computational Physics
A Parallel Implementation of Electron-Phonon Scattering in Nanoelectronic Devices up to 95k Cores

Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Sparse non-linear least squares optimization for geometric vision

ECCV'10 Proceedings of the 11th European conference on Computer vision: Part II
Face image relighting using locally constrained global optimization

ECCV'10 Proceedings of the 11th European conference on Computer vision: Part IV
Solving Very Sparse Rational Systems of Equations

ACM Transactions on Mathematical Software (TOMS)
Augmented Lagrangian and penalty methods for the simulation of two-phase flows interacting with moving solids. Application to hydroplaning flows interacting with real tire tread patterns

Journal of Computational Physics
The Equivalence of Standard and Mixed Finite Element Methods in Applications to Elasto-Acoustic Interaction

SIAM Journal on Scientific Computing
Exploiting thread-level parallelism in the iterative solution of sparse linear systems

Parallel Computing
A globally convergent semi-smooth Newton method for control-state constrained DAE optimal control problems

Computational Optimization and Applications
Performance models for the Spike banded linear system solver

Scientific Programming
TexToons: practical texture mapping for hand-drawn cartoon animations

Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Non-Photorealistic Animation and Rendering
Are upwind techniques in multi-phase flow models necessary?

Journal of Computational Physics
A domain-decomposing parallel sparse linear system solver

Journal of Computational and Applied Mathematics
A threaded SPIKE algorithm for solving general banded systems

Parallel Computing
A Krylov Method for the Delay Eigenvalue Problem

SIAM Journal on Scientific Computing
Design of a Multicore Sparse Cholesky Factorization Using DAGs

SIAM Journal on Scientific Computing
Developing a dynamic model of cascading failure for high performance computing using Trilinos

Proceedings of the first international workshop on High performance computing, networking and analytics for the power grid
Parametric medial shape representation in 3-D via the Poisson partial differential equation with non-linear boundary conditions

IPMI'05 Proceedings of the 19th international conference on Information Processing in Medical Imaging
Oblio: design and performance

PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Energy-based self-collision culling for arbitrary mesh deformations

ACM Transactions on Graphics (TOG) - SIGGRAPH 2012 Conference Proceedings
Boundary concentrated finite elements for optimal boundary control problems of elliptic PDEs

Computational Optimization and Applications
Dynamic state and parameter estimation applied to neuromorphic systems

Neural Computation
A stable discontinuous Galerkin-type method for solving efficiently Helmholtz problems

Computers and Structures
A stiffly accurate Rosenbrock-type method of order 2 applied to FE-analyses in finite strain viscoelasticity

Applied Numerical Mathematics
Comparison of numerical models in radiative heat transfer with application to circuit-breaker simulations

Mathematics and Computers in Simulation
Pointwise estimates of the SDFEM for convection-diffusion problems with characteristic layers

Applied Numerical Mathematics
Parallel implementations of the trace minimization scheme TraceMIN for the sparse symmetric eigenvalue problem

Computers & Mathematics with Applications
A SIMPLE based discontinuous Galerkin solver for steady incompressible flows

Journal of Computational Physics
Efficient Numerical Methods for Strongly Anisotropic Elliptic Equations

Journal of Scientific Computing
An improved strain gradient plasticity formulation with energetic interfaces: theory and a fully implicit finite element formulation

Computational Mechanics
Two-dimensional simulation of the fluttering instability using a pseudospectral method with volume penalization

Computers and Structures
Project APhiD: A Lorenz-gauged A-Φ decomposition for parallelized computation of ultra-broadband electromagnetic induction in a fully heterogeneous Earth

Computers & Geosciences
Scalable domain decomposition preconditioners for heterogeneous elliptic problems

SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Parallel design and performance of nested filtering factorization preconditioner

SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Pivoting strategies for tough sparse indefinite systems

ACM Transactions on Mathematical Software (TOMS)
Fast methods for computing selected elements of the Green's function in massively parallel nanoelectronic device simulations

Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
Nonzero pattern analysis and memory access optimization in GPU-based sparse LU factorization for circuit simulation

IA^3 '13 Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms
A Lagrangian VOF tensorial penalty method for the DNS of resolved particle-laden flows

Journal of Computational Physics
Two-phase flow with mass density contrast: Stable schemes for a thermodynamic consistent and frame-indifferent diffuse-interface model

Journal of Computational Physics
An object-oriented framework for finite element analysis based on a compact topological data structure

Advances in Engineering Software
Computational homogenisation of composite plates: Consideration of the thickness change with a modified projection strategy

Computers & Mathematics with Applications
Amesos2 and Belos: Direct and iterative solvers for large sparse linear systems

Scientific Programming

Quantified Score

Hi-index	0.04

Abstract

Supernode partitioning for unsymmetric matrices, combined with complete block diagonal supernode pivoting and asynchronous computation, can achieve high gigaflop rates for parallel sparse LU factorization on shared-memory parallel computers. Progress in weighted graph matching algorithms extends these concepts further: an unsymmetric prepermutation of rows places large matrix entries on the diagonal. Complete block diagonal supernode pivoting allows dynamic interchanges of columns and rows during factorization. Level-3 BLAS efficiency is retained, and an advanced two-level left-right looking scheduling scheme yields good speedup on SMP machines. These algorithms have been integrated into the recent unsymmetric version of the PARDISO solver. Experiments demonstrate that a wide range of unsymmetric linear systems can be solved and that high performance is consistently achieved for large sparse unsymmetric matrices from real-world applications.
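
The row prepermutation mentioned in the abstract can be illustrated on a tiny dense matrix. The sketch below is not PARDISO's implementation. It is a brute-force search in plain Python for the row order that maximizes the product of the absolute diagonal entries, whereas the abstract refers to weighted graph matching algorithms, which are designed for large sparse matrices. The function name `large_diagonal_row_permutation`, the example matrix `A`, and the product objective are assumptions chosen for illustration only.

```python
from itertools import permutations
from math import prod


def large_diagonal_row_permutation(a):
    """Return perm such that row perm[j] of `a` is moved to position j.

    Illustrative brute force: tries all n! row orders and keeps the one
    that maximizes the product of the absolute diagonal entries.
    Only sensible for very small n; real solvers use weighted matching.
    """
    n = len(a)
    best = max(
        permutations(range(n)),
        key=lambda p: prod(abs(a[p[j]][j]) for j in range(n)),
    )
    return list(best)


# Hypothetical 3x3 unsymmetric matrix with zeros on its diagonal.
A = [
    [0.0, 2.0, 0.0],
    [0.1, 0.0, 5.0],
    [4.0, 0.5, 0.0],
]

perm = large_diagonal_row_permutation(A)
B = [A[i] for i in perm]                      # row-permuted matrix
diagonal = [B[j][j] for j in range(len(B))]   # entries now on the diagonal

print(perm)      # [2, 0, 1]
print(diagonal)  # [4.0, 2.0, 5.0]
```

In this example the original diagonal is entirely zero, so factorization without interchanges would fail at the first step. After the permutation the diagonal holds the entries 4, 2 and 5, which is the kind of starting point that lets pivoting stay restricted to within supernode blocks.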