Probabilistic, modular and scalable inference of typestate specifications

  • Authors:
  • Nels E. Beckman;Aditya V. Nori

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA, USA;Microsoft Research India, Bangalore, India

  • Venue:
  • Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Static analysis tools aim to find bugs in software that correspond to violations of specifications. Unfortunately, for large and complex software, these specifications are usually either unavailable or sophisticated, and hard to write. This paper presents ANEK, a tool and accompanying methodology for inferring specifications useful for modular typestate checking of programs. In particular, these specifications consist of pre and postconditions along with aliasing annotations known as access permissions. A novel feature of ANEK is that it can generate program specifications even when the code under analysis gives rise to conflicting constraints, a situation that typically occurs when there are bugs. The design of ANEK also makes it easy to add heuristic constraints that encode intuitions gleaned from several years of experience writing such specifications, and this allows it to infer specifications that are better in a subjective sense. The ANEK algorithm is based on a modular analysis that makes it fast and scalable, while producing reliable specifications. All of these features are enabled by its underlying probabilistic analysis that produces specifications that are very likely. Our implementation of ANEK infers access permissions specifications used by the PLURAL [5] modular typestate checker for Java programs. We have run ANEK on a number of Java benchmark programs, including one large open-source program(approximately 38K lines of code), to infer specifications that were then checked using PLURAL. The results for the large benchmark show that ANEK can quickly infer specifications that are both accurate and qualitatively similar to those written by hand, and at 5% of the time taken to manually discover and hand-code the specifications.