A systematic model building process for predicting actionable static analysis alerts

  • Authors:
  • Laurie Williams; Sarah Smith Heckman

  • Affiliations:
  • North Carolina State University; North Carolina State University

  • Year:
  • 2009

Abstract

Automated static analysis tools can identify potential source code anomalies, such as null pointer dereferences, buffer overflows, and unclosed streams, that could lead to field failures. These anomalies, which we call alerts, require inspection by a developer to determine whether the alert is important enough to fix. Actionable alert identification techniques can supplement automated static analysis tools by classifying or prioritizing the generated alerts so that a developer is more likely to inspect actionable alerts first. By classifying and prioritizing actionable static analysis alerts, the developer can focus his or her time on inspecting and fixing actionable alerts rather than inspecting and suppressing unactionable alerts. The goal of my research is to reduce inspection time by accurately predicting actionable and unactionable alerts when using static analysis by creating and validating a systematic actionable alert identification model. The Systematic Actionable Alert Identification (SAAI) process uses machine learning to identify actionable alerts. Investigation of the following three hypotheses informs the goal of my research: (1) Hypothesis 1: the artifact characteristics of an alert and the surrounding source code are predictive of the actionability of an alert; (2) Hypothesis 2: a systematic actionable alert identification technique using machine learning can accurately identify actionable alerts; (3) Hypothesis 3: a systematic actionable alert identification technique using machine learning is project specific. A benchmark, FAULTBENCH, provides the evaluation framework for the proposed SAAI model building process and for comparison with other actionable alert identification techniques. The dissertation presents a feasibility study and three empirical studies evaluating the hypotheses above.
The feasibility study evaluates an adaptive actionable alert identification technique that utilizes the alert's type and code location, in addition to developer feedback, to prioritize actionable alerts. The first empirical study investigates Hypotheses 1–3 using FAULTBENCH on 15 SAAI models generated from five treatments for each of three subject programs. The treatments considered different groupings of alerts within revisions to train and test SAAI. The second empirical study is a comparative evaluation of the generated SAAI models against other actionable alert identification techniques, further evaluating Hypothesis 2. Additionally, an empirical user study was conducted in which students in the senior capstone project course used a custom SAAI model during development of their software project. Selection of predictive artifact characteristics as part of the SAAI process supports the acceptance of Hypothesis 1. All but four of the 58 artifact characteristics used to build SAAI models appeared in one or more of the artifact characteristic subsets. The SAAI model identified actionable and unactionable alerts with greater than 90% accuracy for eight of the 15 FAULTBENCH subject treatments. Comparing SAAI models with other actionable alert identification techniques from the literature found that SAAI models had the highest accuracy for 11 of the 15 treatments when classifying the full alert sets. Both of the above results support Hypothesis 2. Because accuracies were greater than 90% when applying artifact characteristic subsets and machine learning algorithms from one subject program to another subject program, Hypothesis 3 is not supported on the evaluated subject programs.
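The abstract describes SAAI as a supervised machine learning process: artifact characteristics of an alert and its surrounding code serve as features, and the model predicts whether the alert is actionable. The following is a minimal sketch of that idea, not the dissertation's actual SAAI implementation; the feature names (alert lifetime in revisions, file churn, method length), the synthetic training values, and the nearest-centroid learner are all hypothetical illustrations chosen to keep the example self-contained.

```python
# Sketch only: classify static analysis alerts as actionable or
# unactionable from artifact characteristics, using a simple
# nearest-centroid learner on hypothetical feature vectors.
from math import dist

# Each alert: (artifact characteristics, label).
# Characteristics (hypothetical): [alert lifetime in revisions,
# recent file churn, enclosing method length].
training_alerts = [
    ([2.0, 15.0, 40.0], "actionable"),
    ([1.0, 22.0, 55.0], "actionable"),
    ([3.0, 18.0, 35.0], "actionable"),
    ([30.0, 2.0, 120.0], "unactionable"),
    ([45.0, 1.0, 200.0], "unactionable"),
    ([38.0, 3.0, 150.0], "unactionable"),
]

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(alerts):
    """Group training alerts by label and compute one centroid per class."""
    by_label = {}
    for features, label in alerts:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(vs) for label, vs in by_label.items()}

def classify(model, features):
    """Predict the label whose centroid is nearest to the new alert."""
    return min(model, key=lambda label: dist(model[label], features))

model = train(training_alerts)
print(classify(model, [2.5, 20.0, 45.0]))   # → actionable
print(classify(model, [40.0, 2.0, 170.0]))  # → unactionable
```

In the dissertation's process, model and feature-subset selection is done systematically across many candidate machine learning algorithms and the 58 artifact characteristics; this stub stands in for whichever learner that selection produces.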
The contributions of this work are as follows: (1) a systematic actionable alert identification model building process to predict actionable and unactionable automated static analysis alerts; (2) a benchmark, FAULTBENCH, for evaluating and comparing actionable alert identification techniques; and (3) a comparative evaluation of systematic actionable alert identification models against other actionable alert identification techniques from the literature.