Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
The cache performance and optimizations of blocked algorithms
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Unifying data and control transformations for distributed shared-memory machines
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Tile size selection using cache organization and data layout
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Data and computation transformations for multiprocessors
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Unified compilation techniques for shared and distributed address space machines
ICS '95 Proceedings of the 9th international conference on Supercomputing
Evaluating the impact of advanced memory systems on compiler-parallelized codes
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Improving data locality with loop transformations
ACM Transactions on Programming Languages and Systems (TOPLAS)
High Performance Compilers for Parallel Computing
High Performance Compilers for Parallel Computing
False Sharing and Spatial Locality in Multiprocessor Caches
IEEE Transactions on Computers
Reduction of Cache Coherence Overhead by Compiler Data Layout and Loop Transformation
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Compiler Optimizations for Cache Locality and Coherence
Compiler Optimizations for Cache Locality and Coherence
Data transformations for eliminating conflict misses
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
A hyperplane based approach for optimizing spatial locality in loop nests
ICS '98 Proceedings of the 12th international conference on Supercomputing
Eliminating conflict misses for high performance architectures
ICS '98 Proceedings of the 12th international conference on Supercomputing
Improving Cache Locality by a Combination of Loop and Data Transformations
IEEE Transactions on Computers - Special issue on cache memory and related problems
A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts
IEEE Transactions on Parallel and Distributed Systems
Locality optimizations for multi-level caches
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
A compiler technique for improving whole-program locality
POPL '01 Proceedings of the 28th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Tiling optimizations for 3D scientific computations
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Static and Dynamic Locality Optimizations Using Integer Linear Programming
IEEE Transactions on Parallel and Distributed Systems
Efficient Representation Scheme for Multidimensional Array Operations
IEEE Transactions on Computers
Register tiling in nonrectangular iteration spaces
ACM Transactions on Programming Languages and Systems (TOPLAS)
Integrating loop and data transformations for global optimization
Journal of Parallel and Distributed Computing
A Loop Transformation Algorithm Based on Explicit Data Layout Representation for Optimizing Locality
LCPC '98 Proceedings of the 11th International Workshop on Languages and Compilers for Parallel Computing
MARS: A Distributed Memory Approach to Shared Memory Compilation
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Array Unification: A Locality Optimization Technique
CC '01 Proceedings of the 10th International Conference on Compiler Construction
IEEE Transactions on Parallel and Distributed Systems
Array Regrouping and Its Use in Compiling Data-Intensive Embedded Applications
IEEE Transactions on Computers
A Constraint Network Based Approach to Memory Layout Optimization
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
A case for a working-set-based memory hierarchy
Proceedings of the 2nd conference on Computing frontiers
Improving whole-program locality using intra-procedural and inter-procedural transformations
Journal of Parallel and Distributed Computing
Optimization of memory system in real-time embedded systems
ICCOMP'07 Proceedings of the 11th WSEAS International Conference on Computers
Fast indexing for blocked array layouts to reduce cache misses
International Journal of High Performance Computing and Networking
Precise Management of Scratchpad Memories for Localising Array Accesses in Scientific Codes
CC '09 Proceedings of the 18th International Conference on Compiler Construction: Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009
Code scheduling for optimizing parallelism and data locality
EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
Generating structured program instances with a high degree of locality
EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Data locality and parallelism optimization using a constraint-based approach
Journal of Parallel and Distributed Computing