Introduction to algorithms
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
An analysis of dag-consistent distributed shared-memory algorithms
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Locality of Reference in LU Decomposition with Partial Pivoting
SIAM Journal on Matrix Analysis and Applications
Advanced compiler design and implementation
Automatic parallelization of divide and conquer algorithms
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
The influence of caches on the performance of sorting
SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
Tuning Strassen's matrix multiplication for memory efficiency
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Space-limited procedures: a methodology for portable high-performance
PMMP '95 Proceedings of the conference on Programming Models for Massively Parallel Computers
The Fastest Fourier Transform in the West
LAPACK Working Note 20: A Portable Linear Algebra Library For High-Performance Computers
Algorithmic skeletons: a structured approach to the management of parallel computation
Portable high-performance programs
Transforming loops to recursion for multi-level memory hierarchies
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Tiling optimizations for 3D scientific computations
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Optimizing locality for ODE solvers
ICS '01 Proceedings of the 15th international conference on Supercomputing
ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I
Pipelining for Locality Improvement in RK Methods
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Satin: Efficient Parallel Divide-and-Conquer in Java
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Dynamic Partitioning of the Divide-and-Conquer Scheme with Migration in PVM Environment
Proceedings of the 8th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Performance optimization of RK methods using block-based pipelining
Performance analysis and grid computing
Spiral: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms
International Journal of High Performance Computing Applications
Statistical Models for Empirical Search-Based Performance Tuning
International Journal of High Performance Computing Applications
Sparse Tiling for Stationary Iterative Methods
International Journal of High Performance Computing Applications
Optimizing locality and scalability of embedded Runge-Kutta solvers using block-based pipelining
Journal of Parallel and Distributed Computing
FFT program generation for shared memory: SMP and multicore
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Improving locality for ODE solvers by program transformations
Scientific Programming