Strategies for cache and local memory management by global program transformation
Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
POPL '88 Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Proceedings of the 1989 ACM/IEEE conference on Supercomputing
The cache performance and optimizations of blocked algorithms
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Compiler blockability of numerical algorithms
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
SUIF: an infrastructure for research on parallelizing and optimizing compilers
ACM SIGPLAN Notices
Unifying data and control transformations for distributed shared-memory machines
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Tile size selection using cache organization and data layout
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Improving data locality with loop transformations
ACM Transactions on Programming Languages and Systems (TOPLAS)
Combining loop transformations considering caches and scheduling
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Fusion of Loops for Parallelism and Locality
IEEE Transactions on Parallel and Distributed Systems
A compiler algorithm for optimizing locality in loop nests
ICS '97 Proceedings of the 11th international conference on Supercomputing
Cache miss equations: an analytical representation of cache misses
ICS '97 Proceedings of the 11th international conference on Supercomputing
Automatic selection of high-order transformations in the IBM XL FORTRAN compilers
IBM Journal of Research and Development - Special issue: performance analysis and its impact on design
Data transformations for eliminating conflict misses
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Eliminating conflict misses for high performance architectures
ICS '98 Proceedings of the 12th international conference on Supercomputing
Improving locality using loop and data transformations in an integrated framework
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
New tiling techniques to improve cache temporal locality
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Nonlinear array layouts for hierarchical memory systems
ICS '99 Proceedings of the 13th international conference on Supercomputing
An experimental evaluation of tiling and shackling for memory hierarchy management
ICS '99 Proceedings of the 13th international conference on Supercomputing
A tile selection algorithm for data locality and cache interference
ICS '99 Proceedings of the 13th international conference on Supercomputing
A Loop Transformation Theory and an Algorithm to Maximize Parallelism
IEEE Transactions on Parallel and Distributed Systems
Quantifying the Multi-level Nature of Tiling Interactions
LCPC '97 Proceedings of the 10th International Workshop on Languages and Compilers for Parallel Computing
On Estimating and Enhancing Cache Effectiveness
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Collective Loop Fusion for Array Contraction
Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing
Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
A Comparison of Compiler Tiling Algorithms
CC '99 Proceedings of the 8th International Conference on Compiler Construction, Held as Part of the European Joint Conferences on the Theory and Practice of Software, ETAPS'99
A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness
CASCON '94 Proceedings of the 1994 conference of the Centre for Advanced Studies on Collaborative research
Tiling optimizations for 3D scientific computations
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Exploiting non-uniform reuse for cache optimization
Proceedings of the 2001 ACM symposium on Applied computing
Automatic Coarse Grain Task Parallel Processing on SMP Using OpenMP
LCPC '00 Proceedings of the 13th International Workshop on Languages and Compilers for Parallel Computing - Revised Papers
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Cache Line Impact on 3D PDE Solvers
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Tiling, Block Data Layout, and Memory Hierarchy Performance
IEEE Transactions on Parallel and Distributed Systems
A Quantitative Analysis of Tile Size Selection Algorithms
The Journal of Supercomputing
Identifying and Exploiting Spatial Regularity in Data Memory References
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
An accurate cost model for guiding data locality transformations
ACM Transactions on Programming Languages and Systems (TOPLAS)
Fast indexing for blocked array layouts to reduce cache misses
International Journal of High Performance Computing and Networking
Multi-level tiling: M for the price of one
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Parametric multi-level tiling of imperfectly nested loops
Proceedings of the 23rd international conference on Supercomputing
Coarse grain task parallel processing with cache optimization on shared memory multiprocessor
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
Parameterized tiling revisited
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
Practical loop transformations for tensor contraction expressions on multi-level memory hierarchies
CC'11/ETAPS'11 Proceedings of the 20th international conference on Compiler construction: part of the joint European conferences on theory and practice of software
Optimizing matrix multiplication with a classifier learning system
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Tuning blocked array layouts to exploit memory hierarchy in SMT architectures
PCI'05 Proceedings of the 10th Panhellenic conference on Advances in Informatics
An ILP-Based approach to locality optimization
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Hierarchical parallelism control for multigrain parallel processing
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Near-optimal padding for removing conflict misses
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Analytical bounds for optimal tile size selection
CC'12 Proceedings of the 21st international conference on Compiler Construction