Hash-flow taint analysis of higher-order programs

  • Authors:
  • Shuying Liang;Matthew Might

  • Affiliations:
  • University of Utah;University of Utah

  • Venue:
  • Proceedings of the 7th Workshop on Programming Languages and Analysis for Security
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

As web applications have grown in popularity, so have attacks on such applications. Cross-site scripting and injection attacks have become particularly problematic. Both vulnerabilities stem, at their core, from improper sanitization of user input. We propose static taint analysis, which can verify the absence of unsanitized input errors at compile-time. Unfortunately, precise static analysis of modern scripting languages like Python is challenging: higher-orderness and complex control-flow collide with opaque, dynamic data structures like hash maps and objects. The interdependence of data-flow and control-flow make it hard to attain both soundness and precision. In this work, we apply abstract interpretation to sound and precise taint-style static analysis of scripting languages. We first define λH, a core calculus of modern scripting languages, with hash maps, dynamic objects, higher-order functions and first class control. Then we derive a framework of k-CFA-like CESK-style abstract machines for statically reasoning about λH, but with hash maps factored into a "Curried Object store." The Curried object store---and shape analysis on this store---allows us to recover field sensitivity, even in the presence of dynamically modified fields. Lastly, atop this framework, we devise a taint-flow analysis, leveraging its field-sensitive, interprocedural and context-sensitive properties to soundly and precisely detect security vulnerabilities, like XSS attacks in web applications. We have prototyped the analytical framework for Python, and conducted preliminary experiments with web applications. A low rate of false alarms demonstrates the promise of this approach.