Is reuse distance applicable to data locality analysis on chip multiprocessors?

  • Authors:
  • Yunlian Jiang;Eddy Z. Zhang;Kai Tian;Xipeng Shen

  • Affiliations:
  • Computer Science Department, The College of William and Mary, Williamsburg, VA;Computer Science Department, The College of William and Mary, Williamsburg, VA;Computer Science Department, The College of William and Mary, Williamsburg, VA;Computer Science Department, The College of William and Mary, Williamsburg, VA

  • Venue:
  • CC'10/ETAPS'10 Proceedings of the 19th joint European conference on Theory and Practice of Software, international conference on Compiler Construction
  • Year:
  • 2010

Abstract

On chip multiprocessors (CMPs), multiple cores commonly share certain levels of cache. This sharing increases contention for cache and for memory-to-chip bandwidth, further highlighting the importance of data locality analysis. As a rigorous and hardware-independent locality metric, reuse distance has served a variety of locality analyses, program transformations, and performance predictions. However, previous studies have concentrated on sequential programs running on unicore processors. On a CMP, accesses by different threads (or jobs) interact in the shared cache. How reuse distance applies to the new architecture remains an open question: in particular, how the interactions in shared cache affect the collection and application of reuse distance, and how reuse-distance-based locality analysis should adapt to such architectural changes. This paper presents our explorations towards answering those questions. It first introduces the concept of concurrent reuse distance, a direct extension of the traditional concept of reuse distance that considers data references by all co-running threads (or jobs). It then discusses the properties of concurrent reuse distance, revealing the special challenges facing its collection and application on CMP platforms. Finally, it presents solutions to those challenges for a class of multithreaded applications. The solutions center on a probabilistic model that connects concurrent reuse distance with the data locality of each individual thread. Experiments demonstrate the effectiveness of the proposed techniques in facilitating the use of concurrent reuse distance for CMP computing.
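
To make the central concept concrete, the following minimal sketch contrasts per-thread reuse distance with concurrent reuse distance. It uses the standard definition of reuse distance (the number of distinct data elements accessed between two consecutive accesses to the same element) and applies it to a merged trace of all co-running threads. The function names and the round-robin interleaving are assumptions made for illustration only; they are not taken from the paper, whose probabilistic model is not reproduced here, and real interleavings depend on thread timing.

```python
from typing import Hashable, List, Optional, Sequence, Tuple


def reuse_distances(trace: Sequence[Hashable]) -> List[Optional[int]]:
    """For each access, count the distinct elements touched since the
    previous access to the same element (None for a first access)."""
    last_pos = {}  # element -> index of its most recent access
    result: List[Optional[int]] = []
    for i, addr in enumerate(trace):
        if addr in last_pos:
            between = trace[last_pos[addr] + 1 : i]
            result.append(len(set(between)))
        else:
            result.append(None)
        last_pos[addr] = i
    return result


def interleave_round_robin(
    threads: Sequence[Sequence[Hashable]],
) -> List[Tuple[int, Hashable]]:
    """Hypothetical interleaving (an assumption of this sketch): each
    thread issues one access per step, in thread-id order."""
    merged: List[Tuple[int, Hashable]] = []
    longest = max((len(t) for t in threads), default=0)
    for step in range(longest):
        for tid, t in enumerate(threads):
            if step < len(t):
                merged.append((tid, t[step]))
    return merged


def concurrent_reuse_distances(
    threads: Sequence[Sequence[Hashable]],
) -> List[Tuple[int, Hashable, Optional[int]]]:
    """Reuse distance measured on the merged trace, so that references
    by all co-running threads are counted."""
    merged = interleave_round_robin(threads)
    dists = reuse_distances([addr for _, addr in merged])
    return [(tid, addr, d) for (tid, addr), d in zip(merged, dists)]


if __name__ == "__main__":
    t0 = ["a", "b", "a"]  # thread 0 reuses "a"
    t1 = ["x", "y", "z"]  # thread 1 touches unrelated data
    print(reuse_distances(t0))
    print(concurrent_reuse_distances([t0, t1]))
```

In this toy trace, thread 0 alone reuses `a` at distance 1, while in the interleaved trace the same reuse has distance 3 because the co-runner's accesses to `x` and `y` fall in between. When threads share data, the effect can go the other way: a reference by one thread may shorten the distance seen by another.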