Squeezing the most out of an algorithm in CRAY FORTRAN
ACM Transactions on Mathematical Software (TOMS)
An extended set of FORTRAN basic linear algebra subprograms
ACM Transactions on Mathematical Software (TOMS)
ACM Transactions on Mathematical Software (TOMS)
ACM Transactions on Mathematical Software (TOMS)
A set of level 3 basic linear algebra subprograms
ACM Transactions on Mathematical Software (TOMS)
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
IBM Journal of Research and Development
Efficient vector and parallel manipulation of tensor products
ACM Transactions on Mathematical Software (TOMS)
Algorithm 753: TENPACK: a LAPACK-based library for the computer manipulation of tensor products
ACM Transactions on Mathematical Software (TOMS)
The design of MA48: a code for the direct solution of sparse unsymmetric linear systems of equations
ACM Transactions on Mathematical Software (TOMS)
Practical experience in the numerical dangers of heterogeneous computing
ACM Transactions on Mathematical Software (TOMS)
Efficient Sparse LU Factorization with Partial Pivoting on Distributed Memory Architectures
IEEE Transactions on Parallel and Distributed Systems
Elimination forest guided 2D sparse LU factorization
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
The automatic generation of sparse primitives
ACM Transactions on Mathematical Software (TOMS)
Restructuring the BLAS level 1 routine for computing the modified Givens transformation
ACM SIGNUM Newsletter
Space/time-efficient scheduling and execution of parallel irregular computations
ACM Transactions on Programming Languages and Systems (TOPLAS)
Portable and efficient factorization algorithms on the IBM 3090/VF
ICS '89 Proceedings of the 3rd international conference on Supercomputing
OoLALA: an object oriented analysis and design of numerical linear algebra
OOPSLA '00 Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Sparse LU factorization with partial pivoting on distributed memory machines
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Renovating the collected algorithms from ACM
ACM Transactions on Mathematical Software (TOMS)
Preface to the special issue on the basic linear algebra subprograms (BLAS)
ACM Transactions on Mathematical Software (TOMS)
Generic programming for high performance scientific applications
JGI '02 Proceedings of the 2002 joint ACM-ISCOPE conference on Java Grande
The Matrix Template Library: Generic Components for High-Performance Scientific Computing
Computing in Science and Engineering
ISCOPE '98 Proceedings of the Second International Symposium on Computing in Object-Oriented Parallel Environments
An Evaluation of Java for Numerical Computing
ISCOPE '98 Proceedings of the Second International Symposium on Computing in Object-Oriented Parallel Environments
ParNum '99 Proceedings of the 4th International ACPC Conference Including Special Tracks on Parallel Numerics and Parallel Computing in Image Processing, Video Processing, and Multimedia: Parallel Computation
Blocking Techniques in Numerical Software
ParNum '99 Proceedings of the 4th International ACPC Conference Including Special Tracks on Parallel Numerics and Parallel Computing in Image Processing, Video Processing, and Multimedia: Parallel Computation
Continuous program optimization: A case study
ACM Transactions on Programming Languages and Systems (TOPLAS)
Self-adapting software for numerical linear algebra and LAPACK for clusters
Parallel Computing - Special issue: Parallel and distributed scientific and engineering computing
Journal of Computational and Applied Mathematics
Parallel sparse LU factorization on different message passing platforms
Journal of Parallel and Distributed Computing
An evaluation of Java for numerical computing
Scientific Programming
ACM Transactions on Mathematical Software (TOMS)
High Performance Development for High End Computing With Python Language Wrapper (PLW)
International Journal of High Performance Computing Applications
FastScat™: An Object-Oriented Program for Fast Scattering Computation
Scientific Programming - The First Annual Object-Oriented Numerics Conference (OON-SKI '93)
High Performance Implementation of Binomial Option Pricing
ICCSA '08 Proceedings of the international conference on Computational Science and Its Applications, Part I
A unified model for multicore architectures
IFMT '08 Proceedings of the 1st international forum on Next-generation multicore/manycore technologies
C++ Bindings to External Software Libraries with Examples from BLAS, LAPACK, UMFPACK, and MUMPS
ACM Transactions on Mathematical Software (TOMS)
Evaluating multicore algorithms on the unified memory model
Scientific Programming - Software Development for Multi-core Computing Systems
Scaling LAPACK panel operations using parallel cache assignment
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Self-adapting software for numerical linear algebra library routines on clusters
ICCS'03 Proceedings of the 2003 international conference on Computational science: Part III
A supernodal out-of-core sparse Gaussian-elimination method
PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
Parallelization of general matrix multiply routines using OpenMP
WOMPAT'04 Proceedings of the 5th international conference on OpenMP Applications and Tools: Shared Memory Parallel Programming with OpenMP
Scaling LAPACK panel operations using parallel cache assignment
ACM Transactions on Mathematical Software (TOMS)
Cache efficient implementation for block matrix operations
Proceedings of the High Performance Computing Symposium
This paper describes a model implementation and test software for the Level 2 Basic Linear Algebra Subprograms (Level 2 BLAS). Level 2 BLAS are targeted at matrix-vector operations with the aim of providing more efficient, but portable, implementations of algorithms on high-performance computers. The model implementation provides a portable set of FORTRAN 77 Level 2 BLAS for machines where specialized implementations do not exist or are not required. The test software aims to verify that specialized implementations meet the specification of Level 2 BLAS and that implementations are correctly installed.
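As an illustration of the kind of matrix-vector operation the Level 2 BLAS cover, the following is a minimal sketch of a general matrix-vector update y := alpha*A*x + beta*y. It is written in Python rather than FORTRAN 77 and is not the paper's model implementation: the function name `gemv`, its argument order, and the flat column-major storage with leading dimension `lda` are assumptions chosen for this example.

```python
def gemv(alpha, a, lda, m, n, x, beta, y):
    """Return alpha*A*x + beta*y for an m-by-n matrix A.

    Assumption for this sketch: A is a flat list in column-major order,
    so element (i, j) lives at a[i + j * lda], with lda >= max(1, m).
    """
    if lda < max(1, m):
        raise ValueError("lda must be at least max(1, m)")
    if len(x) < n or len(y) < m:
        raise ValueError("x needs n entries and y needs m entries")

    # Start from beta*y, then accumulate one column of A at a time.
    result = [beta * y[i] for i in range(m)]
    for j in range(n):
        temp = alpha * x[j]
        for i in range(m):
            result[i] += temp * a[i + j * lda]
    return result


# A = [[1, 2], [3, 4]] stored column by column.
a = [1.0, 3.0, 2.0, 4.0]
print(gemv(2.0, a, 2, 2, 2, [1.0, 1.0], 1.0, [10.0, 20.0]))
```

Walking the columns in the outer loop keeps the accesses to `a` contiguous under column-major storage. A check in the spirit of the test software would compare such a result against an independently computed reference value.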