A Portable Fortran Program to Find the Euclidean Norm of a Vector
ACM Transactions on Mathematical Software (TOMS)
A Fortran Multiple-Precision Arithmetic Package
ACM Transactions on Mathematical Software (TOMS)
Clarification of Fortran standards—second report
Communications of the ACM
Basic Linear Algebra Subprograms for FORTRAN Usage
Basic Linear Algebra Subprograms for FORTRAN Usage
Procedures for optimization problems with a mixture of bounds and general linear constraints
ACM Transactions on Mathematical Software (TOMS)
Transforming FORTRAN DO loops to improve performance on vector architectures
ACM Transactions on Mathematical Software (TOMS)
A proposal for a set of level 3 basic linear algebra subprograms
ACM SIGNUM Newsletter
An extended set of FORTRAN basic linear algebra subprograms
ACM Transactions on Mathematical Software (TOMS)
Solution of large, dense symmetric generalized eigenvalue problems using secondary storage
ACM Transactions on Mathematical Software (TOMS)
Performance of various computers using standard linear equations software in a FORTRAN environment
ACM SIGARCH Computer Architecture News
ACM Transactions on Mathematical Software (TOMS)
ACM Transactions on Mathematical Software (TOMS)
Engineering and scientific subroutine library for the IBM 3090 vector facility
IBM Systems Journal
Object-oriented programming for linear algebra
OOPSLA '89 Conference proceedings on Object-oriented programming systems, languages and applications
Interprocessor communication speed and performance in distributed-memory parallel processors
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Matrix multiplication on the connection machine
Proceedings of the 1989 ACM/IEEE conference on Supercomputing
Algorithm 676: ODRPACK: software for weighted orthogonal distance regression
ACM Transactions on Mathematical Software (TOMS)
A set of level 3 basic linear algebra subprograms
ACM Transactions on Mathematical Software (TOMS)
Algorithm 686: FORTRAN subroutines for updating the QR decomposition
ACM Transactions on Mathematical Software (TOMS)
Program optimization and parallelization using idioms
POPL '91 Proceedings of the 18th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Sparse extensions to the FORTRAN Basic Linear Algebra Subprograms
ACM Transactions on Mathematical Software (TOMS)
Algorithm 692: Model implementation and test package for the Sparse Basic Linear Algebra Subprograms
ACM Transactions on Mathematical Software (TOMS)
LAPACK: a portable linear algebra library for high-performance computers
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Hierarchical blocking and data flow analysis for numerical linear algebra
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
The impact of memory organization on the performance of matrix multiplication
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Scan primitives for vector computers
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Size and access inference for data-parallel programs
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
A new approach for automatic parallelization of blocked linear Algebra computations
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
The K2 distributed memory parallel processor: architecture, compiler, and operating system
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
FORTRAN subroutines for general Toeplitz systems
ACM Transactions on Mathematical Software (TOMS)
An implementation of a divide and conquer algorithm for the unitary eigen problem
ACM Transactions on Mathematical Software (TOMS)
LSNNO, a FORTRAN subroutine for solving large-scale nonlinear network optimization problems
ACM Transactions on Mathematical Software (TOMS)
Performance of various computers using standard linear equations software
ACM SIGARCH Computer Architecture News
Evaluation of compiler generated parallel programs on three multicomputers
ICS '92 Proceedings of the 6th international conference on Supercomputing
Automatic data mapping for distributed-memory parallel computers
ICS '92 Proceedings of the 6th international conference on Supercomputing
ACM Transactions on Mathematical Software (TOMS)
The role of APL and J in high-performance computation
APL '93 Proceedings of the international conference on APL
Toward parallel mathematical software for elliptic partial differential equations
ACM Transactions on Mathematical Software (TOMS)
Algorithm 728: FORTRAN subroutines for generating quadratic bilevel programming test problems
ACM Transactions on Mathematical Software (TOMS)
A parallel block implementation of Level-3 BLAS for MIMD vector processors
ACM Transactions on Mathematical Software (TOMS)
Program optimization and parallelization using idioms
ACM Transactions on Programming Languages and Systems (TOPLAS)
Conversion to Fortran 90: a case study
ACM Transactions on Mathematical Software (TOMS)
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
IBM Journal of Research and Development
Algorithm 737: INTLIB—a portable Fortran 77 interval standard-function library
ACM Transactions on Mathematical Software (TOMS)
Algorithm 741: least-squares solution of a linear, bordered, block-diagonal system of equations
ACM Transactions on Mathematical Software (TOMS)
Fast floating-point processing in Common Lisp
ACM Transactions on Mathematical Software (TOMS)
ACM Transactions on Mathematical Software (TOMS)
Algorithm 640: Efficient calculation of frequency response matrices from state space models
ACM Transactions on Mathematical Software (TOMS) - The MIT Press scientific computation series
ACM Transactions on Mathematical Software (TOMS)
ACM Transactions on Mathematical Software (TOMS)
The design of a new frontal code for solving sparse, unsymmetric systems
ACM Transactions on Mathematical Software (TOMS)
The design of MA48: a code for the direct solution of sparse unsymmetric linear systems of equations
ACM Transactions on Mathematical Software (TOMS)
ACM Transactions on Mathematical Software (TOMS)
Parallel reduction of banded matrices to bidiagonal form
Parallel Computing
Proceedings of the fourth workshop on I/O in parallel and distributed systems: part of the federated computing research conference
Algorithm 767: a Fortran 77 package for column reduction of polynomial matrices
ACM Transactions on Mathematical Software (TOMS)
ICS '90 Proceedings of the 4th international conference on Supercomputing
Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology
ICS '97 Proceedings of the 11th international conference on Supercomputing
Practical experience in the numerical dangers of heterogeneous computing
ACM Transactions on Mathematical Software (TOMS)
CALYPSO: a computer algebra library for parallel symbolic computation
PASCO '97 Proceedings of the second international symposium on Parallel symbolic computation
Compiler blockability of dense matrix factorizations
ACM Transactions on Mathematical Software (TOMS)
Level 3 basic linear algebra subprograms for sparse matrices: a user-level interface
ACM Transactions on Mathematical Software (TOMS)
Improving the memory-system performance of sparse-matrix vector multiplication
IBM Journal of Research and Development
The automatic generation of sparse primitives
ACM Transactions on Mathematical Software (TOMS)
Restructuring the BLAS level 1 routine for computing the modified Givens transformation
ACM SIGNUM Newsletter
GEMM-based level 3 BLAS: high-performance model implementations and performance evaluation benchmark
ACM Transactions on Mathematical Software (TOMS)
ICS '89 Proceedings of the 3rd international conference on Supercomputing
Portable and efficient factorization algorithms on the IBM 3090/VF
ICS '89 Proceedings of the 3rd international conference on Supercomputing
Squeezing the most out of an algorithm in CRAY FORTRAN
ACM Transactions on Mathematical Software (TOMS)
Matrix multiplication in an interleaved array processing architecture
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
Direct numerical simulation of turbulence with a PC/linux cluster: fact or fiction?
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Design and Performance Evaluation of a Portable Parallel Library for Space-Time Adaptive Processing
IEEE Transactions on Parallel and Distributed Systems
ACM Transactions on Mathematical Software (TOMS)
OoLALA: an object oriented analysis and design of numerical linear algebra
OOPSLA '00 Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Algorithm 539: Basic Linear Algebra Subprograms for Fortran Usage [F1]
ACM Transactions on Mathematical Software (TOMS)
Linearly Constrained Discrete ℓ1 Problems
ACM Transactions on Mathematical Software (TOMS)
Algorithm 576: A FORTRAN Program for Solving Ax = b [F4]
ACM Transactions on Mathematical Software (TOMS)
Algorithm 580: QRUP: A Set of FORTRAN Routines for Updating QR Factorizations [F5]
ACM Transactions on Mathematical Software (TOMS)
ACM Transactions on Mathematical Software (TOMS)
Algorithm 587: Two Algorithms for the Linearly Constrained Least Squares Problem
ACM Transactions on Mathematical Software (TOMS)
ACM Transactions on Mathematical Software (TOMS)
Remark on “Algorithm 539: Basic Linear Algebra Subprograms for Fortran Usage”
ACM Transactions on Mathematical Software (TOMS)
Algorithm 596: a program for a locally parameterized
ACM Transactions on Mathematical Software (TOMS)
PSBLAS: a library for parallel linear algebra computation on sparse matrices
ACM Transactions on Mathematical Software (TOMS)
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
NetSolve: a network server for solving computational science problems
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Automatic translation of Fortran to JVM bytecode
Proceedings of the 2001 joint ACM-ISCOPE conference on Java Grande
A graphical tool for driving the parallel computation of pseudospectra
ICS '01 Proceedings of the 15th international conference on Supercomputing
A recursive formulation of Cholesky factorization of a matrix in packed storage
ACM Transactions on Mathematical Software (TOMS)
FLAME: Formal Linear Algebra Methods Environment
ACM Transactions on Mathematical Software (TOMS)
Optimization of a parallel ocean general circulation model
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
An updated set of basic linear algebra subprograms (BLAS)
ACM Transactions on Mathematical Software (TOMS)
Design, implementation and testing of extended and mixed precision BLAS
ACM Transactions on Mathematical Software (TOMS)
On computing Givens rotations reliably and efficiently
ACM Transactions on Mathematical Software (TOMS)
Algorithm 818: A reference model implementation of the sparse BLAS in Fortran 95
ACM Transactions on Mathematical Software (TOMS)
Preface to the special issue on the basic linear algebra subprograms (BLAS)
ACM Transactions on Mathematical Software (TOMS)
ACM Transactions on Mathematical Software (TOMS)
Generic programming for high performance scientific applications
JGI '02 Proceedings of the 2002 joint ACM-ISCOPE conference on Java Grande
Automatic intra-register vectorization for the Intel architecture
International Journal of Parallel Programming
Automatic Intra-Register Vectorization for the Intel® Architecture
International Journal of Parallel Programming
Linear Algebra Libraries for High-Performance Computers: A Personal Perspective
IEEE Parallel & Distributed Technology: Systems & Technology
The Matrix Template Library: Generic Components for High-Performance Scientific Computing
Computing in Science and Engineering
The Decompositional Approach to Matrix Computation
Computing in Science and Engineering
Faster Numerical Algorithms Via Exception Handling
IEEE Transactions on Computers
An object-oriented programming of an explicit dynamics code: application to impact simulation
Advances in Engineering Software
Statistical Models for Automatic Performance Tuning
ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I
The Design of a Parallel Adaptive Multi-level Code in Fortran 90
ICCS '02 Proceedings of the International Conference on Computational Science-Part III
A Linear Algebra Formulation for Optimising Replication in Data Parallel Programs
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
PARA '02 Proceedings of the 6th International Conference on Applied Parallel Computing Advanced Scientific Computing
Code Generators for Automatic Tuning of Numerical Kernels: Experiences with FFTW
SAIG '00 Proceedings of the International Workshop on Semantics, Applications, and Implementation of Program Generation
ISCOPE '98 Proceedings of the Second International Symposium on Computing in Object-Oriented Parallel Environments
An Evaluation of Java for Numerical Computing
ISCOPE '98 Proceedings of the Second International Symposium on Computing in Object-Oriented Parallel Environments
ParNum '99 Proceedings of the 4th International ACPC Conference Including Special Tracks on Parallel Numerics and Parallel Computing in Image Processing, Video Processing, and Multimedia: Parallel Computation
Blocking Techniques in Numerical Software
ParNum '99 Proceedings of the 4th International ACPC Conference Including Special Tracks on Parallel Numerics and Parallel Computing in Image Processing, Video Processing, and Multimedia: Parallel Computation
Expressing Irregular Computations in Modern Fortran Dialects
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
A Performance Study on a Single Processing Node of the HITACHI SR8000
NAA '00 Revised Papers from the Second International Conference on Numerical Analysis and Its Applications
Advanced environments for parallel and distributed applications: a view of current status
Parallel Computing - Special issue: Advanced environments for parallel and distributed computing
Formal derivation of algorithms: The triangular Sylvester equation
ACM Transactions on Mathematical Software (TOMS)
NetSolve: A Network-Enabled Solver: Examples and Users
HCW '98 Proceedings of the Seventh Heterogeneous Computing Workshop
Performance of various computers using standard linear equations software in a Fortran environment
ACM SIGARCH Computer Architecture News
Mathematical software: past, present, and future
Computational science, mathematics and software
Numerical algorithm delivery mechanisms
Computational science, mathematics and software
Sourcebook of parallel computing
ACM Transactions on Mathematical Software (TOMS)
High-performance linear algebra algorithms using new generalized data structures for matrices
IBM Journal of Research and Development
Brook for GPUs: stream computing on graphics hardware
ACM SIGGRAPH 2004 Papers
Journal of Computational and Applied Mathematics
Journal of Computational and Applied Mathematics - Special issue: Selected papers from the 2nd international conference on advanced computational methods in engineering (ACOMEN2002) Liege University, Belgium, 27-31 May 2002
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Semi-formal design of reliable mesh generation systems
Advances in Engineering Software
Newton-Krylov continuation of periodic orbits for Navier-Stokes flows
Journal of Computational Physics
Supporting Cluster-Based Network Services on Functionally Symmetric Software Architecture
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Reducing Power with Performance Constraints for Parallel Sparse Applications
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 11 - Volume 12
The science of deriving dense linear algebra algorithms
ACM Transactions on Mathematical Software (TOMS)
Representing linear algebra algorithms in code: the FLAME application program interfaces
ACM Transactions on Mathematical Software (TOMS)
Parallel out-of-core computation and updating of the QR factorization
ACM Transactions on Mathematical Software (TOMS)
Impact of the proposed IEEE floating point standard on numerical software
ACM SIGNUM Newsletter
The SLATEC mathematical subroutine library
ACM SIGNUM Newsletter
A proposal for an extended set of Fortran Basic Linear Algebra Subprograms
ACM SIGNUM Newsletter
Issues relating to extension of the Basic Linear Algebra Subprograms
ACM SIGNUM Newsletter
Proposed sparse extensions to the Basic Linear Algebra Subprograms
ACM SIGNUM Newsletter
ACM SIGNUM Newsletter
Programming tools for linear algebra
ACM SIGNUM Newsletter
A framework for adaptive algorithm selection in STAPL
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
A fully portable high performance minimal storage hybrid format Cholesky algorithm
ACM Transactions on Mathematical Software (TOMS)
A Neural Syntactic Language Model
Machine Learning
Journal of Computational and Applied Mathematics
High Performance Linear Algebra Operations on Reconfigurable Systems
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Design patterns and Fortran 90/95
ACM SIGPLAN Fortran Forum
SmartApps: middle-ware for adaptive applications on reconfigurable platforms
ACM SIGOPS Operating Systems Review
Applied Numerical Mathematics - The third international conference on the numerical solutions of volterra and delay equations, May 2004, Tempe, AZ
Parallel Computing - Algorithmic skeletons
An evaluation of Java for numerical computing
Scientific Programming
JLAPACK - compiling LAPACK Fortran to Java
Scientific Programming
Irregular computations in Fortran - expression and implementation strategies
Scientific Programming
Quantitative performance analysis of the SPEC OMPM2001 benchmarks
Scientific Programming - OpenMP
Design patterns for library optimization
Scientific Programming - POOSC '01 Workshop
BLASTH, a BLAS library for dual SMP computer
ALS'00 Proceedings of the 4th annual Linux Showcase & Conference - Volume 4
ACM Transactions on Mathematical Software (TOMS)
Parallel Languages and Compilers: Perspective From the Titanium Experience
International Journal of High Performance Computing Applications
Performance of various computers using standard linear equations software in a Fortran environment
ACM SIGARCH Computer Architecture News
Neural, Parallel & Scientific Computations
Scalable parallelization of FLAME code via the workqueuing model
ACM Transactions on Mathematical Software (TOMS)
High performance BLAS formulation of the multipole-to-local operator in the fast multipole method
Journal of Computational Physics
Implementation and performance analysis of non-blocking collective operations for MPI
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
A highly efficient implementation of a backpropagation learning algorithm using matrix ISA
Journal of Parallel and Distributed Computing
Journal of Computational Physics
The impact of paravirtualized memory hierarchy on linear algebra computational kernels and software
HPDC '08 Proceedings of the 17th international symposium on High performance distributed computing
Algorithm 887: CHOLMOD, Supernodal Sparse Cholesky Factorization and Update/Downdate
ACM Transactions on Mathematical Software (TOMS)
A simulator for adaptive parallel applications
Journal of Computer and System Sciences
Pattern-Driven Automatic Parallelization
Scientific Programming
Dynamic Supernodes in Sparse Cholesky Update/Downdate and Triangular Solves
ACM Transactions on Mathematical Software (TOMS)
The Mailman algorithm: A note on matrix-vector multiplication
Information Processing Letters
Design for Interoperability in stapl: pMatrices and Linear Algebra Algorithms
Languages and Compilers for Parallel Computing
Adaptive Winograd's matrix multiplications
ACM Transactions on Mathematical Software (TOMS)
Solving dense linear systems on platforms with multiple hardware accelerators
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Petascale computing with accelerators
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
LAPACK-Based Condition Estimates for the Discrete-Time LQG Design
Numerical Analysis and Its Applications
Programming the Linpack benchmark for the IBM PowerXCell 8i processor
Scientific Programming - High Performance Computing with the Cell Broadband Engine
Anasazi software for the numerical solution of large-scale eigenvalue problems
ACM Transactions on Mathematical Software (TOMS)
Programming matrix algorithms-by-blocks for thread-level parallelism
ACM Transactions on Mathematical Software (TOMS)
C++ Bindings to External Software Libraries with Examples from BLAS, LAPACK, UMFPACK, and MUMPS
ACM Transactions on Mathematical Software (TOMS)
Streamlining Offload Computing to High Performance Architectures
ICCS '09 Proceedings of the 9th International Conference on Computational Science: Part I
From Silicon to Science: The Long Road to Production Reconfigurable Supercomputing
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Computational tools for the analysis of spatial patterns of gene expression in Common Lisp
Proceedings of the 2007 International Lisp Conference
On the Need for a Consortium of Capability Centers
International Journal of High Performance Computing Applications
Applied Numerical Mathematics
ACM Transactions on Mathematical Software (TOMS)
A message-passing hardware/software cosimulation environment for reconfigurable computing systems
International Journal of Reconfigurable Computing - Special issue on selected papers from ReConFig 2008
Implementing sparse matrix-vector multiplication on throughput-oriented processors
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Replacing square roots by Pythagorean sums
IBM Journal of Research and Development
Design and exploitation of a high-performance SIMD floating-point unit for Blue Gene/L
IBM Journal of Research and Development
Scaling LAPACK panel operations using parallel cache assignment
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Journal of Computational and Applied Mathematics
Rectangular full packed format for Cholesky's algorithm: factorization, solution, and inversion
ACM Transactions on Mathematical Software (TOMS)
A collection of parallel linear equations routines for the Denelcor HEP
Parallel Computing
The impact of memory organization on the performance of matrix calculations
Parallel Computing
Paper: Toward a better parallel performance metric
Parallel Computing
Self-adapting numerical software and automatic tuning of heuristics
ICCS'03 Proceedings of the 2003 international conference on Computational science
Minimal data copy for dense linear algebra factorization
PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Performance evaluation of basic linear algebra subroutines on a matrix co-processor
PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
Implementing and optimizing a data-intensive hydrodynamics application on the stream processor
ICCSA'07 Proceedings of the 2007 international conference on Computational science and its applications - Volume Part III
Composing parallel software efficiently with lithe
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Programming the Linpack benchmark for Roadrunner
IBM Journal of Research and Development
Optimization of triangular matrix functions in BLAS library on Loongson2F
NPC'10 Proceedings of the 2010 IFIP international conference on Network and parallel computing
Using hybrid CPU-GPU platforms to accelerate the computation of the matrix sign function
Euro-Par'09 Proceedings of the 2009 international conference on Parallel processing
The general matrix multiply-add operation on 2D torus
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
A simulator for parallel applications with dynamically varying compute node allocation
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Solving Very Sparse Rational Systems of Equations
ACM Transactions on Mathematical Software (TOMS)
Simple optimizations for an applicative array language for graphics processors
Proceedings of the sixth workshop on Declarative aspects of multicore programming
DESOLA: An active linear algebra library using delayed evaluation and runtime code generation
Science of Computer Programming
Exact solutions to linear systems of equations using output sensitive lifting
ACM Communications in Computer Algebra
Solving dense interval linear systems with verified computing on multicore architectures
VECPAR'10 Proceedings of the 9th international conference on High performance computing for computational science
Modeling and predicting the efficiency of application execution in distributed environments
ICCOMP'06 Proceedings of the 10th WSEAS international conference on Computers
Improving CSE software through reproducibility requirements
Proceedings of the 4th International Workshop on Software Engineering for Computational Science and Engineering
Numerical Python for scalable architectures
Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model
A domain-decomposing parallel sparse linear system solver
Journal of Computational and Applied Mathematics
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
The Combinatorial BLAS: design, implementation, and applications
International Journal of High Performance Computing Applications
Conditioning and error estimation in the numerical solution of matrix Riccati equations
NAA'04 Proceedings of the Third international conference on Numerical Analysis and its Applications
Parallelising matrix operations on clusters for an optimal control-based quantum compiler
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
Deciding where to call performance libraries
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
A matrix-type for performance–portability
PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Rapid development of high-performance linear algebra libraries
PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Semi-automatic generation of grid computing interfaces for numerical software libraries
PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Parallelization of general matrix multiply routines using OpenMP
WOMPAT'04 Proceedings of the 5th international conference on OpenMP Applications and Tools: shared Memory Parallel Programming with OpenMP
On domain-specific languages reengineering
GPCE'05 Proceedings of the 4th international conference on Generative Programming and Component Engineering
Data mining with parallel support vector machines for classification
ADVIS'06 Proceedings of the 4th international conference on Advances in Information Systems
MadLINQ: large-scale distributed matrix computation for the cloud
Proceedings of the 7th ACM european conference on Computer Systems
A generalization of s-step variants of gradient methods
Journal of Computational and Applied Mathematics
Foundations and Trends® in Machine Learning
Vectorizing codes for studying long-range transport of air pollutants
Mathematical and Computer Modelling: An International Journal
Running air pollution models on the connection machine
Mathematical and Computer Modelling: An International Journal
Concurrency and Computation: Practice & Experience
Journal of Parallel and Distributed Computing
GPU-based parallel algorithms for sparse nonlinear systems
Journal of Parallel and Distributed Computing
Concurrency and Computation: Practice & Experience
Generalizing matrix multiplication for efficient computations on modern computers
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I
Modeling performance through memory-stalls
ACM SIGMETRICS Performance Evaluation Review
Families of Algorithms for Reducing a Matrix to Condensed Form
ACM Transactions on Mathematical Software (TOMS)
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Parallelizing dense linear algebra operations with task queues in llc
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Expressing graph algorithms using generalized active messages
Proceedings of the 27th international ACM conference on International conference on supercomputing
Scaling LAPACK panel operations using parallel cache assignment
ACM Transactions on Mathematical Software (TOMS)
Cache efficient implementation for block matrix operations
Proceedings of the High Performance Computing Symposium
Discrete adjoints of PETSc through dco/c++ and adjoint MPI
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
A case study in mechanically deriving dense linear algebra code
International Journal of High Performance Computing Applications
Trends and outlook for the massive-scale analytics stack
IBM Journal of Research and Development