Sound and precise analysis of web applications for injection vulnerabilities

  • Authors:
  • Gary Wassermann;Zhendong Su

  • Affiliations:
  • University of California: Davis, Davis, CA;University of California: Davis, Davis, CA

  • Venue:
  • Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Web applications are popular targets of security attacks. One common type of such attacks is SQL injection, where an attacker exploits faulty application code to execute maliciously crafted database queries. Bothstatic and dynamic approaches have been proposed to detect or prevent SQL injections; while dynamic approaches provide protection for deployed software, static approaches can detect potential vulnerabilities before software deployment. Previous static approaches are mostly based on tainted information flow tracking and have at least some of the following limitations: (1) they do not model the precise semantics of input sanitization routines; (2) they require manually written specifications, either for each query or for bug patterns; or (3) they are not fully automated and may require user intervention at various points in the analysis. In this paper, we address these limitations by proposing a precise, sound, and fully automated analysis technique for SQL injection. Our technique avoids the need for specifications by consideringas attacks those queries for which user input changes the intended syntactic structure of the generated query. It checks conformance to this policy byconservatively characterizing the values a string variable may assume with a context free grammar, tracking the nonterminals that represent user-modifiable data, and modeling string operations precisely as language transducers. We have implemented the proposed technique for PHP, the most widely-used web scripting language. Our tool successfully discovered previously unknown and sometimes subtle vulnerabilities in real-world programs, has a low false positive rate, and scales to large programs (with approx. 100K loc).