An empirical study of the reliability of UNIX utilities
Communications of the ACM
CIL: Intermediate Language and Tools for Analysis and Transformation of C Programs
CC '02 Proceedings of the 11th International Conference on Compiler Construction
ITS4: A static vulnerability scanner for C and C++ code
ACSAC '00 Proceedings of the 16th Annual Computer Security Applications Conference
Testing static analysis tools using exploitable buffer overflows from open source code
Proceedings of the 12th ACM SIGSOFT twelfth international symposium on Foundations of software engineering
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
DART: directed automated random testing
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
CUTE: a concolic unit testing engine for C
Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering
EXE: automatically generating inputs of death
Proceedings of the 13th ACM conference on Computer and communications security
Compositional dynamic test generation
Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Address obfuscation: an efficient approach to combat a board range of memory error exploits
SSYM'03 Proceedings of the 12th conference on USENIX Security Symposium - Volume 12
StackGuard: automatic adaptive detection and prevention of buffer-overflow attacks
SSYM'98 Proceedings of the 7th conference on USENIX Security Symposium - Volume 7
Detecting buffer overflow via automatic test input data generation
Computers and Operations Research
Grammar-based whitebox fuzzing
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Approximating edit distance in near-linear time
Proceedings of the forty-first annual ACM symposium on Theory of computing
Taint-based directed whitebox fuzzing
ICSE '09 Proceedings of the 31st International Conference on Software Engineering
Loop-extended symbolic execution on binary programs
Proceedings of the eighteenth international symposium on Software testing and analysis
KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Hi-index | 0.00 |
Fuzzing is widely used to detect software vulnerabilities. Blackbox fuzzing does not require program source code. It mutates well-formed inputs to produce new ones. However, these new inputs usually do not exercise deep program semantics since the possibility that they can satisfy the conditions of a deep program state is low. As a result, blackbox fuzzing is often limited to identify vulnerabilities in input validation components of a program. Domain knowledge such as input specifications can be used to mitigate these limitations. However, it is often expensive to obtain such knowledge in practice. Whitebox fuzzing employs heavy analysis techniques, i.e., dynamic symbolic execution, to systematically generate test inputs and explore as many paths as possible. It is powerful to explore new program branches so as to identify more vulnerabilities. However, it has fundamental challenges such as unsolvable constraints and is difficult to scale to large programs due to path explosion. This paper proposes a novel fuzzing approach that aims to produce test inputs to explore deep program semantics effectively and efficiently. The fuzzing process comprises two stages. At the first stage, a traditional blackbox fuzzing approach is applied for test data generation. This process is guided by a novel test case similarity metric. At the second stage, a subset of the test inputs generated at the first stage is selected based on the test case similarity metric. Then, combination testing is applied on these selected test inputs to further generate new inputs. As a result, less redundant test inputs, i.e., inputs that just explore shallow program paths, are created at the first stage, and more distinct test inputs, i.e., inputs that explore deep program paths, are produced at the second stage. A prototype tool SimFuzz is developed and evaluated on real programs, and the experimental results are promising.