Software caching has been shown to be a robust approach in multi-core systems that lack hardware support for transparent data transfers between local and global memories. A software cache gives the user a transparent view of the memory architecture and considerably improves the programmability of such systems. However, this software approach can suffer from poor performance due to the considerable overhead of the software mechanisms that maintain memory consistency. This paper presents a set of techniques to mitigate that overhead. A specific write-back mechanism is introduced, based on some degree of speculation about the number of threads actually modifying the same cache lines. A case study based on the Cell BE processor is described. Performance evaluation indicates that the optimized software-cache structures, combined with the proposed code optimizations, yield speedups of 20% to 40% over a traditional software cache approach.