The goal of the LAPACK project is to design and implement a portable linear algebra library that runs efficiently on a variety of high-performance computers. The library is based on the widely used LINPACK and EISPACK packages for solving linear equations, eigenvalue problems, and linear least-squares problems, but extends their functionality in a number of ways. The main technique for making the algorithms run faster is to restructure them so that their inner loops perform block matrix operations (e.g., matrix-matrix multiplication). These block operations can then be optimized to exploit the memory hierarchy of a specific architecture. The LAPACK project is also developing new algorithms that yield higher relative accuracy for a variety of linear algebra problems.
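
To make the idea of block matrix operations in the inner loops concrete, the following is a minimal sketch of a tiled matrix-matrix multiplication. It is not LAPACK code: the function name `blocked_matmul` and the `block` parameter are hypothetical, chosen only for illustration, and plain Python lists are used so the example is self-contained. It shows the loop structure only; any real speedup would come from a compiled, architecture-tuned kernel operating on tiles sized to fit in fast memory.

```python
def blocked_matmul(A, B, block=2):
    """Return C = A * B for list-of-lists matrices, computed tile by tile.

    Hypothetical illustration: `block` is the tile edge length, which in a
    tuned implementation would be chosen to match the cache size.
    """
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    # Outer loops walk over tiles of C, A, and B.
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for p0 in range(0, k, block):
                # Inner block operation: the (i0, j0) tile of C is updated
                # with the product of the (i0, p0) tile of A and the
                # (p0, j0) tile of B. Each tile is reused several times
                # while it is (ideally) resident in fast memory.
                for i in range(i0, min(i0 + block, n)):
                    for p in range(p0, min(p0 + block, k)):
                        a = A[i][p]
                        for j in range(j0, min(j0 + block, m)):
                            C[i][j] += a * B[p][j]
    return C


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    print(blocked_matmul(A, B))
```

The result does not depend on the tile size; only the order in which the partial products are accumulated changes. That is why the same blocked algorithm can be tuned per machine by adjusting the tile size or swapping in an optimized kernel for the inner block operation.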