Path- and index-sensitive string analysis based on monadic second-order logic

  • Authors:
  • Takaaki Tateishi;Marco Pistoia;Omer Tripp

  • Affiliations:
  • IBM Research - Tokyo;IBM Research - T. J. Watson Research Center;IBM Software Group and Tel Aviv University

  • Venue:
  • Proceedings of the 2011 International Symposium on Software Testing and Analysis
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

We propose a novel technique for statically verifying the strings generated by a program. The verification is conducted by encoding the program in Monadic Second-Order Logic (M2L). We use M2L to describe constraints among program variables and to abstract built-in string operations. Once we encode a program in M2L, a theorem prover for M2L, such as MONA, can automatically check if a string generated by the program satisfies a given specification, and if not, exhibit a counterexample. With this approach, we can naturally encode relationships among strings, accounting also for cases in which a program manipulates strings using indices. In addition, our string analysis is path sensitive in that it accounts for the effects of string and Boolean comparisons, as well as regular-expression matches. We have implemented our string-analysis algorithm, and used it to augment an industrial security analysis for Web applications by automatically detecting and verifying sanitizers---methods that eliminate malicious patterns from untrusted strings, making those strings safe to use in security-sensitive operations. On the 8 benchmarks we analyzed, our string analyzer discovered 128 previously unknown sanitizers, compared to 71 sanitizers detected by a previously presented string analysis.