Go with the flow: profiling copies to find runtime bloat

  • Authors: Guoqing Xu; Matthew Arnold; Nick Mitchell; Atanas Rountev; Gary Sevitsky

  • Affiliations: Ohio State University, Columbus, OH, USA; IBM T.J. Watson Research Center, Hawthorne, NY, USA; IBM T.J. Watson Research Center, Hawthorne, NY, USA; Ohio State University, Columbus, OH, USA; IBM T.J. Watson Research Center, Hawthorne, NY, USA

  • Venue: Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation
  • Year: 2009

Abstract

Many large-scale Java applications suffer from runtime bloat. They execute large volumes of method calls and create many temporary objects, all to perform relatively simple operations. There are large opportunities for performance optimization in these applications, but most are missed by existing optimization and tooling technology. While JIT optimizations struggle for a few percent, performance experts analyze deployed applications and regularly find gains of 2x or more. Finding such big gains is difficult, for both humans and compilers, because of the diffuse nature of runtime bloat. Time is spread thinly across calling contexts, making it difficult to judge how to improve performance. Bloat results from a pile-up of seemingly harmless decisions. Each adds temporary objects and method calls, and often copies values between those temporary objects. While data copies are not the entirety of bloat, we have observed that they are excellent indicators of regions of excessive activity. By optimizing copies, one is likely to remove the objects that carry copied values, and the method calls that allocate and populate them. We introduce copy profiling, a technique that summarizes runtime activity in terms of chains of data copies. A flat copy profile counts copies by method. We show how flat profiles alone can be helpful. In many cases, however, diagnosing a problem requires data flow context. Tracking and making sense of raw copy chains does not scale, so we introduce a summarizing abstraction called the copy graph. We implement three client analyses that, using the copy graph, expose common patterns of bloat, such as finding hot copy chains and discovering temporary data structures. We demonstrate, with examples from a large-scale commercial application and several benchmarks, that copy profiling can be used by a programmer to quickly find opportunities for large performance gains.
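
The abstract gives no implementation details, so the following is only a toy sketch of the two data structures it names: a flat copy profile (copies counted per method) and a copy graph (copies summarized as weighted edges between locations). It is not the authors' implementation. The class `CopyProfileSketch`, its methods, the string-named locations such as `Request.name`, and the greedy "hottest chain" walk are all assumptions made for illustration; a real profiler would record copies through JVM instrumentation rather than explicit calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of a flat copy profile and a copy graph. All names are hypothetical. */
class CopyProfileSketch {
    // Flat profile: method name -> number of copies performed in that method.
    private final Map<String, Long> flatProfile = new TreeMap<>();
    // Copy graph: source location -> (target location -> number of copies).
    private final Map<String, Map<String, Long>> copyGraph = new TreeMap<>();

    /** Record one copy of a value from location 'from' to location 'to' inside 'method'. */
    void recordCopy(String method, String from, String to) {
        flatProfile.merge(method, 1L, Long::sum);
        copyGraph.computeIfAbsent(from, k -> new TreeMap<>()).merge(to, 1L, Long::sum);
    }

    /** Flat-profile query: how many copies were recorded in 'method'. */
    long copiesIn(String method) {
        return flatProfile.getOrDefault(method, 0L);
    }

    /** Copy-graph query: how many copies were recorded along the edge from -> to. */
    long edgeCount(String from, String to) {
        Map<String, Long> out = copyGraph.getOrDefault(from, Collections.<String, Long>emptyMap());
        return out.getOrDefault(to, 0L);
    }

    /**
     * Greedy walk (an assumed heuristic, not taken from the paper): starting at 'start',
     * repeatedly follow the most frequent outgoing edge to an unvisited location.
     */
    List<String> hottestChain(String start, int maxHops) {
        List<String> chain = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        chain.add(start);
        seen.add(start);
        String current = start;
        for (int hop = 0; hop < maxHops; hop++) {
            Map<String, Long> out = copyGraph.get(current);
            if (out == null) break;               // no outgoing copies: chain ends here
            String best = null;
            long bestCount = 0;
            for (Map.Entry<String, Long> e : out.entrySet()) {
                if (e.getValue() > bestCount && !seen.contains(e.getKey())) {
                    best = e.getKey();
                    bestCount = e.getValue();
                }
            }
            if (best == null) break;              // every successor was already visited
            chain.add(best);
            seen.add(best);
            current = best;
        }
        return chain;
    }

    public static void main(String[] args) {
        CopyProfileSketch profile = new CopyProfileSketch();
        // Simulated run: a value is copied into a temporary, then out again, three times.
        for (int i = 0; i < 3; i++) {
            profile.recordCopy("Parser.parse", "Request.name", "Temp.name");
            profile.recordCopy("Renderer.render", "Temp.name", "Response.name");
        }
        profile.recordCopy("Logger.log", "Request.name", "Log.buffer");

        System.out.println("copies in Parser.parse: " + profile.copiesIn("Parser.parse"));
        System.out.println("hottest chain: " + profile.hottestChain("Request.name", 5));
    }
}
```

In this simulated run, the chain passing through `Temp.name` is the kind of pattern the abstract describes: a temporary location that only carries a value from one place to another, and so is a candidate for removal along with the calls that populate it.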