Algorithms that access memory in regular patterns are typical of scientific computing, image processing, and multimedia. Cache conflicts are often responsible for performance degradation, but they can be avoided by an adequate placement of data in memory. The huge search space of such compile-time placements is systematically reduced until we arrive at a class of very simple mappings, well known from data distribution across processors in parallel computing. The choice of parameters is then guided by a cost function that reflects the trade-off between additional instruction overhead and reduced miss penalty. We show by experiment that, when the overhead is kept low, a considerable speedup can be achieved.
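The kind of cross interference the abstract refers to can be made concrete with a small example. The C sketch below is our own illustration, not the paper's mapping or cost function; the cache size, array length, and pad are assumed values. On a direct-mapped cache, two arrays whose base addresses differ by a multiple of the cache size map their corresponding elements to the same cache sets and evict each other on every iteration; shifting one base address by a small pad is a simple placement that removes the conflict.

    /*
     * Illustrative sketch, not the paper's algorithm. With the assumed
     * parameters, N * sizeof(double) = 512 KiB is a multiple of the
     * 32 KiB direct-mapped cache, so a[i] and b_conflict[i] map to the
     * same cache set and thrash. A pad of 16 doubles (128 bytes) shifts
     * b to different sets and eliminates the cross interference.
     */
    #include <stdio.h>

    #define N          (1 << 16)   /* elements per array                 */
    #define CACHE_SIZE (1 << 15)   /* assumed 32 KiB direct-mapped cache */
    #define PAD        16          /* assumed pad, in doubles (128 B)    */

    static double pool[2 * N + PAD];

    int main(void)
    {
        double *a = pool;
        /* Conflicting placement: base addresses N doubles apart,
         * a multiple of CACHE_SIZE.                                    */
        double *b_conflict = pool + N;
        /* Conflict-free placement: pad moves b to different sets.      */
        double *b_padded = pool + N + PAD;

        double *b = b_padded;   /* use b_conflict to observe thrashing  */
        (void)b_conflict;

        for (long i = 0; i < N; i++) {
            a[i] = (double)i;
            b[i] = 1.0;
        }

        double sum = 0.0;
        for (long i = 0; i < N; i++)
            sum += a[i] * b[i];

        printf("sum = %f\n", sum);
        return 0;
    }

In this reading, the cost function described above would govern the choice of PAD: the offset must be large enough to separate the arrays' cache sets, yet small enough that the extra address arithmetic and wasted memory remain negligible compared with the avoided miss penalty.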