Strategies for cache and local memory management by global program transformation
Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
Applications of spatial data structures: Computer graphics, image processing, and GIS
Improving register allocation for subscripted variables
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
The cache performance and optimizations of blocked algorithms
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Design and evaluation of a compiler algorithm for prefetching
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
A parallel hashed Oct-Tree N-body algorithm
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Effective cache prefetching on bus-based multiprocessors
ACM Transactions on Computer Systems (TOCS)
Journal of Parallel and Distributed Computing
ICS '95 Proceedings of the 9th international conference on Supercomputing
Improving data locality with loop transformations
ACM Transactions on Programming Languages and Systems (TOPLAS)
Data-centric multi-level blocking
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
The art of computer programming, volume 3 (2nd ed.): sorting and searching
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Organizing matrices and matrix operations for paged memory systems
Communications of the ACM
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
Tuning Strassen's matrix multiplication for memory efficiency
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
On Estimating and Enhancing Cache Effectiveness
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
On Partitioning Dynamic Adaptive Grid Hierarchies
HICSS '96 Proceedings of the 29th Hawaii International Conference on System Sciences Volume 1: Software Technology and Architecture
Memory Hierarchy Management for Iterative Graph Structures
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Improving the performance of virtual memory computers
Software methods for improvement of cache performance on supercomputer applications
Code transformations to improve memory parallelism
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Compiler and Run-Time Support for Exploiting Regularity within Irregular Applications
IEEE Transactions on Parallel and Distributed Systems
Transforming loops to recursion for multi-level memory hierarchies
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Improving fine-grained irregular shared-memory benchmarks by data reordering
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
ICS '01 Proceedings of the 15th international conference on Supercomputing
Irregularity in multi-dimensional space-filling curves with applications in multimedia databases
Proceedings of the tenth international conference on Information and knowledge management
Data Relation Vectors: A New Abstraction for Data Optimizations
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Scaling irregular parallel codes with minimal programming effort
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
International Journal of Parallel Programming
Rescheduling for Locality in Sparse Matrix Computations
ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I
Compiler and Runtime Support for Irregular Reductions on a Multithreaded Architecture
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
A Comparison of Parallelization Techniques for Irregular Reductions
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Improving Locality for Adaptive Irregular Scientific Codes
LCPC '00 Proceedings of the 13th International Workshop on Languages and Compilers for Parallel Computing-Revised Papers
Reducing Communication Cost for Parallelizing Irregular Scientific Codes
PARA '02 Proceedings of the 6th International Conference on Applied Parallel Computing Advanced Scientific Computing
A Comparison of Locality Transformations for Irregular Codes
LCR '00 Selected Papers from the 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Array Unification: A Locality Optimization Technique
CC '01 Proceedings of the 10th International Conference on Compiler Construction
Compile-time composition of run-time data and iteration reorderings
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Analysis of Multi-Dimensional Space-Filling Curves
Geoinformatica
Sourcebook of parallel computing
Array Regrouping and Its Use in Compiling Data-Intensive Embedded Applications
IEEE Transactions on Computers
Quasidynamic Layout Optimizations for Improving Data Locality
IEEE Transactions on Parallel and Distributed Systems
SFCGen: A framework for efficient generation of multi-dimensional space-filling curves by recursion
ACM Transactions on Mathematical Software (TOMS)
Metrics and models for reordering transformations
MSP '04 Proceedings of the 2004 workshop on Memory system performance
Parallel techniques in irregular codes: cloth simulation as case of study
Journal of Parallel and Distributed Computing
Improving the energy behavior of block buffering using compiler optimizations
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Exploiting Locality for Irregular Scientific Codes
IEEE Transactions on Parallel and Distributed Systems
Localization and coverage for high density sensor networks
Computer Communications
Runtime characterisation of irregular accesses applied to parallelisation of irregular reductions
International Journal of Computational Science and Engineering
An analytical model of locality-based parallel irregular reductions
Parallel Computing
Exploring a Novel Gathering Method for Finite Element Codes on the Cell/B.E. Architecture
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Region-based parallelization of irregular reductions on explicitly managed memory hierarchies
The Journal of Supercomputing
Proceedings of the international conference on Supercomputing
Applying data copy to improve memory performance of general array computations
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
A study on load imbalance in parallel hypermatrix multiplication using OpenMP
PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
Adapting linear algebra codes to the memory hierarchy using a hypermatrix scheme
PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
Symmetrical data sieving for noncontiguous I/O accesses in molecular dynamics simulations
PVM/MPI'05 Proceedings of the 12th European PVM/MPI users' group conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Combining performance aspects of irregular Gauss-Seidel via sparse tiling
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
2HOT: an improved parallel hashed oct-tree N-body algorithm for cosmological simulation
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis