A tile selection algorithm for data locality and cache interference

Authors:
Jacqueline Chame; Sungdo Moon
Affiliations:
Information Sciences Institute, University of Southern California; Information Sciences Institute, University of Southern California
Venue:
ICS '99 Proceedings of the 13th international conference on Supercomputing
Year:
1999

Citing 11

More iteration space tiling
Proceedings of the 1989 ACM/IEEE conference on Supercomputing

The cache performance and optimizations of blocked algorithms
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems

To copy or not to copy: a compile-time technique for assessing when data copying should be used to eliminate cache conflicts
Proceedings of the 1993 ACM/IEEE conference on Supercomputing

Improving locality and parallelism in nested loops
Improving locality and parallelism in nested loops

Tile size selection using cache organization and data layout
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation

Influence of cross-interferences on blocked loops: a case study with matrix-vector multiply
ACM Transactions on Programming Languages and Systems (TOPLAS)

Internal organization of the Alpha 21164, a 300-MHz 64-bit quad-issue CMOS RISC microprocessor
Digital Technical Journal - Special 10th anniversary issue

Cache miss equations: an analytical representation of cache misses
ICS '97 Proceedings of the 11th international conference on Supercomputing

Precise miss analysis for program transformations with caches of arbitrary associativity
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems

On Estimating and Enhancing Cache Effectiveness
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing

A compiler analysis of cache interference and its applications to compiler optimizations
A compiler analysis of cache interference and its applications to compiler optimizations

Cited 27

Locality optimizations for multi-level caches
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing

Optimal partitioning and balanced scheduling with the maximal overlap of data footprints
GLSVLSI '01 Proceedings of the 11th Great Lakes symposium on VLSI

Tiling optimizations for 3D scientific computations
Proceedings of the 2000 ACM/IEEE conference on Supercomputing

Minimizing Average Schedule Length under Memory Constraints by Optimal Partitioning and Prefetching
Journal of VLSI Signal Processing Systems

Scheduling and partitioning for multiple loop nests
Proceedings of the 14th international symposium on Systems synthesis

Combined partitioning and data padding for scheduling multiple loop nests
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems

Compiler-Controlled Caching in Superword Register Files for Multimedia Extension Architectures
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques

Iterative Compilation
Embedded Processor Design Challenges: Systems, Architectures, Modeling, and Simulation - SAMOS

Iterative compilation
Embedded processor design challenges

A Quantitative Analysis of Tile Size Selection Algorithms
The Journal of Supercomputing

Automatic tiling of iterative stencil loops
ACM Transactions on Programming Languages and Systems (TOPLAS)

A Geometric Programming Framework for Optimal Multi-Level Tiling
Proceedings of the 2004 ACM/IEEE conference on Supercomputing

Integrated Loop Optimizations for Data Locality Enhancement of Tensor Contraction Expressions
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing

Empirical optimization for a sparse linear solver: a case study
International Journal of Parallel Programming - Special issue: The next generation software program

Partitioning and scheduling DSP applications with maximal memory access hiding
EURASIP Journal on Applied Signal Processing

Fast indexing for blocked array layouts to reduce cache misses
International Journal of High Performance Computing and Networking

Dynamic tiling for effective use of shared caches on multithreaded processors
International Journal of High Performance Computing and Networking

Positivity, posynomials and tile size selection
Proceedings of the 2008 ACM/IEEE conference on Supercomputing

Simultaneous minimization of capacity and conflict misses
Journal of Computer Science and Technology

Automatic creation of tile size selection models
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization

On the interaction of tiling and automatic parallelization
IWOMP'05/IWOMP'06 Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming

Tuning blocked array layouts to exploit memory hierarchy in SMT architectures
PCI'05 Proceedings of the 10th Panhellenic conference on Advances in Informatics

Analytical bounds for optimal tile size selection
CC'12 Proceedings of the 21st international conference on Compiler Construction

Benefits of using parallelized non-progressive network coding
Journal of Network and Computer Applications

Automatic OpenCL work-group size selection for multicore CPUs
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

Adaptive Mapping and Parameter Selection Scheme to Improve Automatic Code Generation for GPUs
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization

Tile size selection revisited
ACM Transactions on Architecture and Code Optimization (TACO)
