Strategies for cache and local memory management by global program transformation
Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
The cache performance and optimizations of blocked algorithms
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
SUIF: an infrastructure for research on parallelizing and optimizing compilers
ACM SIGPLAN Notices
Avoiding conflict misses dynamically in large direct-mapped caches
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Unifying data and control transformations for distributed shared-memory machines
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Tile size selection using cache organization and data layout
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Data and computation transformations for multiprocessors
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Reducing false sharing on shared memory multiprocessors through compile time data transformations
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Improving data locality with loop transformations
ACM Transactions on Programming Languages and Systems (TOPLAS)
A quantitative analysis of loop nest locality
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Compiler-directed page coloring for multiprocessors
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Fusion of Loops for Parallelism and Locality
IEEE Transactions on Parallel and Distributed Systems
Eliminating cache conflict misses through XOR-based placement functions
ICS '97 Proceedings of the 11th international conference on Supercomputing
A compiler algorithm for optimizing locality in loop nests
ICS '97 Proceedings of the 11th international conference on Supercomputing
Non-singular data transformations: definition, validity and applications
ICS '97 Proceedings of the 11th international conference on Supercomputing
Cache miss equations: an analytical representation of cache misses
ICS '97 Proceedings of the 11th international conference on Supercomputing
Data transformations for eliminating conflict misses
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Enhancing Software DSM for Compiler-Parallelized Applications
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
On Estimating and Enhancing Cache Effectiveness
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
New tiling techniques to improve cache temporal locality
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Nonlinear array layouts for hierarchical memory systems
ICS '99 Proceedings of the 13th international conference on Supercomputing
Locality optimizations for multi-level caches
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Memory characteristics of iterative methods
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Tiling optimizations for 3D scientific computations
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Loop optimization for a class of memory-constrained computations
ICS '01 Proceedings of the 15th international conference on Supercomputing
Exact analysis of the cache behavior of nested loops
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
The hardness of cache conscious data placement
POPL '02 Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Tight bounds on cache use for stencil operations on rectangular grids
Journal of the ACM (JACM)
HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
A Compiler Framework for Tiling Imperfectly-Nested Loops
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
Compiler and Run-Time Support for Improving Locality in Scientific Codes
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
Improving Cache Effectiveness through Array Data Layout Manipulation in SAC
IFL '00 Selected Papers from the 12th International Workshop on Implementation of Functional Languages
A Quantitative Analysis of Tile Size Selection Algorithms
The Journal of Supercomputing
Improving Data Locality by Array Contraction
IEEE Transactions on Computers
A Geometric Programming Framework for Optimal Multi-Level Tiling
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
An accurate cost model for guiding data locality transformations
ACM Transactions on Programming Languages and Systems (TOPLAS)
Analyzing data reuse for cache reconfiguration
ACM Transactions on Embedded Computing Systems (TECS)
Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver
Journal of Parallel and Distributed Computing - Special issue: 18th International parallel and distributed processing symposium
The hardness of cache conscious data placement
Nordic Journal of Computing
Partitioning and scheduling DSP applications with maximal memory access hiding
EURASIP Journal on Applied Signal Processing
Locality enhancement by array contraction
LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
Tuning blocked array layouts to exploit memory hierarchy in SMT architectures
PCI'05 Proceedings of the 10th Panhellenic conference on Advances in Informatics
Near-optimal padding for removing conflict misses
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Hi-index | 0.00 |