Grammar-based whitebox fuzzing

  • Authors:
  • Patrice Godefroid;Adam Kiezun;Michael Y. Levin

  • Affiliations:
  • Microsoft Research, Redmond, WA, USA;Massachusetts Institute of Technology, Cambridge, MA, USA;Microsoft Center for Software Excellence, Redmond, WA, USA

  • Venue:
  • Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages. In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.