Grammar-based whitebox fuzzing

Authors:
Patrice Godefroid;Adam Kiezun;Michael Y. Levin
Affiliations:
Microsoft Research, Redmond, WA, USA;Massachusetts Institute of Technology, Cambridge, MA, USA;Microsoft Center for Software Excellence, Redmond, WA, USA
Venue:
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Year:
2008

Abstract

Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing, and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages. In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.
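
To make the idea of a grammar-based constraint check more concrete, the following is a minimal sketch, not the solver described in the paper. It assumes that a path constraint can be expressed at the token level as a list of allowed-token sets, one per input position, and it decides satisfiability by a bounded breadth-first search over leftmost derivations of a toy grammar. The grammar `GRAMMAR`, the function `solve`, and the bound `max_len` are all hypothetical names introduced for illustration.

```python
from collections import deque

# Hypothetical toy grammar: dict keys are nonterminals, every other symbol is a token.
GRAMMAR = {
    "Stmt": [["function", "id", "(", ")", "{", "Body", "}"],
             ["id", "=", "Expr", ";"]],
    "Body": [[], ["Stmt", "Body"]],
    "Expr": [["num"], ["id"], ["Expr", "+", "Expr"]],
}


def solve(grammar, start, constraint, max_len=12):
    """Return a token list derivable from `start` whose i-th token lies in
    constraint[i] (None means unconstrained), or None if the bounded search
    finds no such sentence.

    Assumption: sentential forms longer than `max_len` are discarded, so a
    None result only means "unsatisfiable within the bound".
    """
    def prefix_ok(form):
        # Check only the tokens before the first nonterminal; later symbols
        # may still shift position once that nonterminal is expanded.
        for i, sym in enumerate(form):
            if sym in grammar:
                return True
            if i < len(constraint) and constraint[i] is not None \
                    and sym not in constraint[i]:
                return False
        return True

    queue = deque([(start,)])
    seen = {(start,)}
    while queue:
        form = queue.popleft()
        # Index of the leftmost nonterminal, or None if the form is a sentence.
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            # A full sentence must cover every constrained position.
            if len(form) >= len(constraint):
                return list(form)
            continue
        for prod in grammar[form[idx]]:
            new = form[:idx] + tuple(prod) + form[idx + 1:]
            if len(new) > max_len or new in seen or not prefix_ok(new):
                continue  # prune: too long, already visited, or violates constraint
            seen.add(new)
            queue.append(new)
    return None


# Constraint: a function whose body begins with an identifier.
sat = solve(GRAMMAR, "Stmt",
            [{"function"}, {"id"}, {"("}, {")"}, {"{"}, {"id"}])
# Constraint that no sentence of the grammar can satisfy: "function" followed by "(".
unsat = solve(GRAMMAR, "Stmt", [{"function"}, {"("}])
```

In this sketch, a constraint that contradicts the grammar is rejected without ever producing a concrete input, which mirrors the abstract's point about avoiding dead ends caused by non-parsable inputs. A real solver would need a more scalable procedure than bounded enumeration.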