Field-sensitive program dependence analysis

  • Authors:
  • Shay Litvak;Nurit Dor;Rastislav Bodik;Noam Rinetzky;Mooly Sagiv

  • Affiliations:
  • Tel Aviv University & Panaya Inc., Tel-Aviv, Israel;Panaya Inc., Rannana, Israel;University of California, Berkeley, Berkeley, CA, USA;Queen Mary University of London, London, United Kingdom;Tel Aviv University, Tel Aviv, Israel

  • Venue:
  • Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
  • Year:
  • 2010

Abstract

Statement st transitively depends on statement st_seed if the execution of st_seed may affect the execution of st. Computing transitive program dependences is a fundamental operation in many automatic software analysis tools. Existing tools find it challenging to compute transitive dependences for programs manipulating large aggregate structure variables, and their limitations adversely affect analysis of certain important classes of software systems, e.g., large-scale enterprise resource planning (ERP) systems. This paper presents an efficient conservative interprocedural static analysis algorithm for computing field-sensitive transitive program dependences in the presence of large aggregate structure variables. Our key insight is that program dependences coming from operations on whole substructures can be precisely (i.e., field-sensitively) represented at the granularity of substructures instead of individual fields. Technically, we adapt the interval domain to concisely record dependences between multiple pairs of fields of aggregate structure variables by exploiting the fields' spatial arrangement. We prove that our algorithm is as precise as any algorithm which works at the granularity of individual fields, the most precise known approach for this problem. Our empirical study, in which we analyzed industrial ERP programs with over 100,000 lines of code on average, shows significant improvements in both running time and memory consumption over existing approaches. The baseline is an efficient field-insensitive whole-structure algorithm that incurs a 62% false error rate. An atomization-based algorithm, which disassembles every aggregate structure variable into the collection of its individual fields, can remove all these false errors at the cost of doubling the average analysis time, from 30 to 60 minutes. In contrast, our new precise algorithm removes all false errors while increasing the time only to 35 minutes. In terms of memory consumption, our algorithm increases the footprint by less than 10%, compared to the 50% overhead of the atomizing algorithm.
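
The idea of recording dependences at substructure granularity can be illustrated with a small sketch. It is not the paper's algorithm: it is a flow-insensitive toy that assumes each structure is a flat range of byte offsets and that every dependence comes from a whole-range copy. The names `Interval`, `CopyDep`, `sources_of`, `transitive_sources`, and the variables `customer`, `order`, and `invoice` are hypothetical. One interval pair stands in for all the field-to-field pairs a copy induces, and a query on a single field is answered by intersecting and shifting intervals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: int  # inclusive start offset
    hi: int  # exclusive end offset

    def intersect(self, other):
        """Return the overlapping interval, or None if there is no overlap."""
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Interval(lo, hi) if lo < hi else None

    def shift(self, delta):
        return Interval(self.lo + delta, self.hi + delta)


@dataclass(frozen=True)
class CopyDep:
    """Hypothetical record: dst_var[dst] was copied from src_var[src]."""
    dst_var: str
    dst: Interval
    src_var: str
    src: Interval  # assumed to have the same length as dst


def sources_of(deps, var, field):
    """Direct sources of the offset range `field` of variable `var`."""
    result = []
    for d in deps:
        if d.dst_var != var:
            continue
        overlap = d.dst.intersect(field)
        if overlap is not None:
            # Map the overlapping part back into the source structure.
            result.append((d.src_var, overlap.shift(d.src.lo - d.dst.lo)))
    return result


def transitive_sources(deps, var, field):
    """Worklist closure over sources_of (ignores statement order and kills)."""
    seen = set()
    work = [(var, field)]
    while work:
        v, iv = work.pop()
        for item in sources_of(deps, v, iv):
            if item not in seen:
                seen.add(item)
                work.append(item)
    return seen


# Two copies, each recorded as a single interval pair:
#   order[0:16]  = customer[8:24]   (a four-field substructure copy)
#   invoice[4:8] = order[12:16]     (a single-field copy)
deps = [
    CopyDep("order", Interval(0, 16), "customer", Interval(8, 24)),
    CopyDep("invoice", Interval(4, 8), "order", Interval(12, 16)),
]

print(transitive_sources(deps, "invoice", Interval(4, 8)))
print(transitive_sources(deps, "invoice", Interval(0, 4)))
```

In this sketch the query on `invoice[4:8]` reaches only `order[12:16]` and `customer[20:24]`, while the query on `invoice[0:4]` reaches nothing. A whole-structure (field-insensitive) representation would report that every part of `invoice` depends on all of `customer`, which is the kind of false dependence the abstract describes.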