Exploiting statistical correlations for proactive prediction of program behaviors

  • Authors:
  • Yunlian Jiang;Eddy Z. Zhang;Kai Tian;Feng Mao;Malcom Gethers;Xipeng Shen;Yaoqing Gao

  • Affiliations:
  • The College of William and Mary, Williamsburg, VA, USA;The College of William and Mary, Williamsburg, VA, USA;The College of William and Mary, Williamsburg, VA, USA;The College of William and Mary, Williamsburg, VA, USA;The College of William and Mary, Williamsburg, VA, USA;The College of William and Mary, Williamsburg, VA, USA;IBM Toronto Lab, Toronto, ON, Canada

  • Venue:
  • Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
  • Year:
  • 2010

Abstract

This paper presents a finding and a technique on program behavior prediction. The finding is that surprisingly strong statistical correlations exist among the behaviors of different program components (e.g., loops) and among different types of program-level behaviors (e.g., loop trip-counts versus data values). Furthermore, the correlations can be beneficially exploited: they help resolve the proactivity-adaptivity dilemma faced by existing program behavior predictions, making it possible to gain the strengths of both approaches: the large scope and earliness of offline-profiling-based predictions, and the cross-input adaptivity of runtime-sampling-based predictions. The main technique contributed by this paper centers on a new concept, seminal behaviors. Motivated by the existence of strong correlations among program behaviors, we propose a regression-based framework to automatically identify a small set of behaviors that can lead to accurate prediction of other behaviors in a program. We call these seminal behaviors. By applying statistical learning techniques, the framework constructs predictive models that map from seminal behaviors to other behaviors, enabling proactive and cross-input adaptive prediction of program behaviors. The prediction helps a commercial compiler, the IBM XL C compiler, generate code that runs up to 45% faster (5%-13% on average), demonstrating the large potential of correlation-based techniques for program optimizations.
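To make the idea of seminal behaviors concrete, here is a minimal sketch in Python. It is not the authors' framework: the abstract does not specify the regression models or the selection procedure, so the sketch assumes plain linear least squares and a greedy forward selection, and every name in it (`select_seminal`, `predict_from_seminal`, the synthetic behavior matrix `B`) is hypothetical. Each row of `B` stands for one profiled input and each column for one observed behavior, such as a loop trip-count or a data value.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y against the columns of X plus an intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(X, coef):
    """Apply a model produced by fit_linear to new rows of X."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef

def total_error(B, chosen):
    """Sum of squared residuals when every column of B is predicted
    from the chosen columns only."""
    X = B[:, chosen]
    err = 0.0
    for j in range(B.shape[1]):
        coef = fit_linear(X, B[:, j])
        err += float(np.sum((predict_linear(X, coef) - B[:, j]) ** 2))
    return err

def select_seminal(B, tol=1e-6):
    """Greedy forward selection (an assumption, not the paper's procedure):
    keep adding the column that most reduces the total prediction error
    until the error drops to tol or no columns remain."""
    chosen = []
    remaining = list(range(B.shape[1]))
    while remaining and total_error(B, chosen) > tol:
        best = min(remaining, key=lambda j: total_error(B, chosen + [j]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

def predict_from_seminal(B, chosen, new_values):
    """Predict all behaviors for a new input, given only the observed
    values of its seminal behaviors (in the same order as chosen)."""
    X = B[:, chosen]
    x_new = np.asarray(new_values, dtype=float).reshape(1, -1)
    return np.array([
        predict_linear(x_new, fit_linear(X, B[:, j]))[0]
        for j in range(B.shape[1])
    ])

# Synthetic training profile: six inputs, four behaviors.
n = np.array([10, 20, 30, 40, 50, 60], dtype=float)  # e.g. an input size
m = np.array([3, 1, 4, 1, 5, 9], dtype=float)        # e.g. a data value
B = np.column_stack([n, 2 * n + 3, m, n + 5 * m])    # correlated behaviors

chosen = select_seminal(B)

# A new input: observe only the seminal behaviors, predict the rest.
true_row = np.array([70.0, 2 * 70.0 + 3, 2.0, 70.0 + 5 * 2.0])
predicted = predict_from_seminal(B, chosen, true_row[chosen])
```

In this toy data all four behaviors are linear functions of two underlying quantities, so two columns suffice to predict the rest. Real program behaviors need not be linearly related, which is presumably why the paper speaks of statistical learning techniques more generally.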