Array regrouping on CMP with non-uniform cache sharing

  • Authors:
  • Yunlian Jiang; Eddy Z. Zhang; Xipeng Shen; Yaoqing Gao; Roch Archambault

  • Affiliations:
  • Computer Science Department, The College of William and Mary, Williamsburg, VA; Computer Science Department, The College of William and Mary, Williamsburg, VA; Computer Science Department, The College of William and Mary, Williamsburg, VA; IBM Toronto Software Lab, Toronto, Canada; IBM Toronto Software Lab, Toronto, Canada

  • Venue:
  • LCPC'10: Proceedings of the 23rd International Conference on Languages and Compilers for Parallel Computing
  • Year:
  • 2010

Abstract

Array regrouping enhances program spatial locality by interleaving elements of multiple arrays that tend to be accessed close together in time. Its effectiveness has been systematically studied for sequential programs running on unicore processors, but not for multithreaded programs on modern Chip Multiprocessor (CMP) machines. On one hand, the processor-level parallelism on CMP intensifies memory bandwidth pressure, suggesting potential benefits of array regrouping for CMP computing. On the other hand, CMP architectures exhibit extra complexities, especially the hierarchical, heterogeneous cache sharing among hyperthreads, cores, and processors, that impose new challenges on array regrouping. In this work, we initiate an exploration of these new opportunities and challenges. We propose cache-sharing-aware reference affinity analysis for identifying data affinity in multithreaded applications. The analysis consists of affinity-guided thread scheduling and hierarchical reference-vector merging, handles cache sharing among both hyperthreads and cores, and offers hints for array regrouping and the avoidance of false sharing. Preliminary experiments demonstrate the potential of the techniques for improving the locality of multithreaded applications on CMP while avoiding various pitfalls.
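
As an illustration of the basic transformation the abstract describes, the following is a minimal sequential sketch in C of regrouping two arrays that are read together. It is not taken from the paper: the names `pair_t`, `regroup`, `dot_separate`, and `dot_regrouped` are hypothetical, and the sketch does not model the cache-sharing-aware affinity analysis, thread scheduling, or reference-vector merging.

```c
#include <assert.h>
#include <stddef.h>

/* Original layout: a[] and b[] are separate arrays, so each iteration
 * reads from two different regions of memory. */
double dot_separate(const double *a, const double *b, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Regrouped layout (hypothetical type): a[i] and b[i] are interleaved,
 * so the two values used together sit next to each other in memory. */
typedef struct {
    double a;
    double b;
} pair_t;

/* Copy the two separate arrays into the interleaved layout. */
void regroup(const double *a, const double *b, pair_t *out, size_t n) {
    assert(n == 0 || (a != NULL && b != NULL && out != NULL));
    for (size_t i = 0; i < n; i++) {
        out[i].a = a[i];
        out[i].b = b[i];
    }
}

/* Same computation as dot_separate, but over the regrouped array. */
double dot_regrouped(const pair_t *ab, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += ab[i].a * ab[i].b;
    return sum;
}
```

Both functions compute the same result; only the memory layout differs. In a multithreaded setting the same interleaving can hurt if different threads write `a` and `b`, since the two fields may then share a cache line. That is the false-sharing pitfall the abstract mentions.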