Semi-automatic fault localization

  • Authors:
  • Mary Jean Harrold;James Arthur Jones

  • Affiliations:
  • Georgia Institute of Technology;Georgia Institute of Technology

  • Venue:
  • Semi-automatic fault localization
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

One of the most expensive and time-consuming components of the debugging process is locating the errors or faults. To locate faults, developers must identify statements involved in failures and select suspicious statements that might contain faults. In practice, this localization is done by developers in a tedious and manual way, using only a single execution, targeting only one fault, and having a limited perspective into a large search space. The thesis of this research is that fault localization can be partially automated with the use of commonly available dynamic information gathered from test-case executions in a way that is effective, efficient, tolerant of test cases that pass but also execute the fault, and scalable to large programs that potentially contain multiple faults. The overall goal of this research is to develop effective and efficient fault localization techniques that scale to programs of large size and with multiple faults. There are three principle steps performed to reach this goal: (1) Develop practical techniques for locating suspicious regions in a program; (2) Develop techniques to partition test suites into smaller, specialized test suites to target specific faults; and (3) Evaluate the usefulness and cost of these techniques. In this dissertation, the difficulties and limitations of previous work in the area of fault-localization are investigated. These investigations informed the development of a new technique, called Tarantula, that addresses some key limitations of prior work in the area, namely effectiveness, efficiency, and practicality. Empirical evaluation of the Tarantula technique shows that it is efficient and effective for many faults. The evaluation also demonstrates that the Tarantula technique can loose effectiveness as the number of faults increases. To address the loss of effectiveness for programs with multiple faults, supporting techniques have been developed and are presented. The empirical evaluation of these supporting techniques demonstrates that they can enable effective fault localization in the presence of multiple faults. A new mode of debugging, called parallel debugging, is developed and empirical evidence demonstrates that it can provide a savings in terms of both total expense and time to delivery. A prototype visualization is provided to display the fault-localization results as well as to provide a method to interact and explore those results. Lastly, a study on the effects of the composition of test suites on fault-localization is presented.