BegBunch: benchmarking for C bug detection tools

Authors:
Cristina Cifuentes, Christian Hoermann, Nathan Keynes, Lian Li, Simon Long, Erica Mealy, Michael Mounteney, Bernhard Scholz
Affiliations:
Sun Microsystems Laboratories, Brisbane, Australia (all authors)
Venue:
Proceedings of the 2nd International Workshop on Defects in Large Software Systems: Held in conjunction with the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2009)
Year:
2009

Abstract

Benchmarks for bug detection tools are still in their infancy. Although various tools and techniques have been introduced in recent years, little effort has been spent on creating a benchmark suite and a harness for consistent quantitative and qualitative performance measurement. To assess the performance of a bug detection tool, and to determine which tool is better suited to a given type of code, the following questions arise: 1) how many bugs are correctly found, 2) what is the tool's average false positive rate, 3) how many bugs are missed by the tool altogether, and 4) does the tool scale. In this paper we present our contribution to the C bug detection community: two benchmark suites that allow developers and users to evaluate the accuracy and scalability of a given tool. The two suites contain buggy, mature open source code; the bugs are representative of "real world" bugs. A harness accompanies each benchmark suite to automatically compute the qualitative and quantitative performance of a bug detection tool. BegBunch has been tested on the Solaris™, Mac OS X and Linux operating systems. We show the generality of the harness by evaluating it with our own Parfait and three publicly available bug detection tools developed by others.
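
The first three questions in the abstract can be illustrated with a small scoring sketch. This is not the BegBunch harness or its input format: the `Bug` record, the `score` function, the exact-match rule on (file, line, kind), and the definition of the false positive rate as false positives divided by all reports are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bug:
    """Hypothetical bug record; not the BegBunch report format."""
    file: str
    line: int
    kind: str


def score(known, reported):
    """Compare a tool's reports against the known bugs of a benchmark.

    Assumption: a report counts as correct only if file, line and kind
    all match a known bug exactly.
    """
    known, reported = set(known), set(reported)
    true_pos = known & reported      # question 1: bugs correctly found
    false_pos = reported - known     # reports that match no known bug
    false_neg = known - reported     # question 3: bugs missed altogether
    return {
        "true_positives": len(true_pos),
        "false_positives": len(false_pos),
        "false_negatives": len(false_neg),
        # question 2, assumed here to mean FP / (TP + FP)
        "false_positive_rate": len(false_pos) / len(reported) if reported else 0.0,
        "detection_rate": len(true_pos) / len(known) if known else 0.0,
    }


# Made-up data: three known bugs, three reports, two of them correct.
known = [
    Bug("a.c", 10, "buffer-overflow"),
    Bug("b.c", 42, "null-deref"),
    Bug("c.c", 7, "buffer-overflow"),
]
reported = [
    Bug("a.c", 10, "buffer-overflow"),
    Bug("b.c", 42, "null-deref"),
    Bug("d.c", 3, "memory-leak"),
]
result = score(known, reported)
```

With this made-up data the tool finds two of the three known bugs, raises one false alarm, and misses one bug. The fourth question, scalability, would need timing runs on code bases of increasing size and is not covered by this sketch.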