A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
LAPACK Users' Guide
Compiler transformations for high-performance computing
ACM Computing Surveys (CSUR)
Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology
ICS '97 Proceedings of the 11th international conference on Supercomputing
Continuous profiling: where have all the cycles gone?
Proceedings of the sixteenth ACM symposium on Operating systems principles
Data transformations for eliminating conflict misses
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Eliminating conflict misses for high performance architectures
ICS '98 Proceedings of the 12th international conference on Supercomputing
Computer architecture (2nd ed.): a quantitative approach
Improving locality using loop and data transformations in an integrated framework
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Efficient Memory Programming
Computer
The Fastest Fourier Transform in the West
Tiling optimizations for 3D scientific computations
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Optimizing locality for ODE solvers
ICS '01 Proceedings of the 15th international conference on Supercomputing
Experiences tuning SMG98: a semicoarsening multigrid benchmark based on the hypre library
ICS '02 Proceedings of the 16th international conference on Supercomputing
Data Layout Optimizations for Variable Coefficient Multigrid
ICCS '02 Proceedings of the International Conference on Computational Science-Part III
Treating a User-Defined Parallel Library as a Domain-Specific Language
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Performance Optimization of 3D Multigrid on Hierarchical Memory Architectures
PARA '02 Proceedings of the 6th International Conference on Applied Parallel Computing Advanced Scientific Computing
Pipelining for Locality Improvement in RK Methods
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
An Analytical Evaluation of Tiling for Stencil Codes with Time Loop
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Performance optimization of RK methods using block-based pipelining
Performance analysis and grid computing
Multigrid and Gauss-Seidel smoothers revisited: parallelization on chip multiprocessors
Proceedings of the 20th annual international conference on Supercomputing
Improving locality for ODE solvers by program transformations
Scientific Programming
Applied Numerical Mathematics
Hierarchical hybrid grids: achieving TERAFLOP performance on large scale finite element simulations
International Journal of Parallel, Emergent and Distributed Systems
Reconsidering algorithms for iterative solvers in the multicore era
International Journal of Computational Science and Engineering
Algorithms for memory hierarchies: advanced lectures
Enhancing the performance of multigrid smoothers in simultaneous multithreading architectures
VECPAR'06 Proceedings of the 7th international conference on High performance computing for computational science
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing