F007: finding rediscovered faults from the field using function-level failed traces of software in the field

Authors:
Syed Shariyar Murtaza;Mechelle Gittens;Zude Li;Nazim H. Madhavji
Affiliations:
University of Western Ontario, London, Canada;University of Western Ontario, London, Canada;University of Western Ontario, London, Canada;University of Western Ontario, London, Canada
Venue:
Proceedings of the 2010 Conference of the Center for Advanced Studies on Collaborative Research
Year:
2010

Abstract

Studies show that approximately 50% to 90% of the failures reported from the field are rediscoveries of previous faults, and that approximately 80% of failures originate from approximately 20% of the code. Despite this, identifying the origin of a failure in system code remains an arduous activity that consumes substantial resources. Prior fault discovery techniques for field traces either require many passing and failing traces, discover only crashing failures, or identify only coarse-grained faulty code, such as files, as the source of the fault. This paper describes a new method (F007) that identifies finer-grained faulty code (faulty functions) using only failed traces of deployed software. F007 extracts patterns of function calls from a historical collection of function-level failed traces, and then trains decision trees on the extracted function-call patterns for each known faulty function. For a new failure trace, F007 then predicts a ranked list of faulty functions based on the probability of fault proneness obtained from the decision trees. Our case study on the Siemens suite shows that F007: (a) can identify rediscovered faulty functions (with new or old faults) with 60% to 86% accuracy, (b) needs to examine approximately 5% to 10% of the code for the Siemens suite, and (c) can discover the faulty functions in every new failed trace by using a small collection of previous failed traces. Thus, F007 can correctly identify the faulty functions for the majority (80% to 90%) of field failures given knowledge of a fault in a small percentage (20%) of functions.
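
The pipeline described in the abstract (failed traces, one decision tree per known faulty function, a ranked list for a new trace) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each trace is reduced to the set of functions it called (a simplification of the paper's function-call patterns), uses a small information-gain tree written for this example instead of whatever learner the paper used, and runs on invented function names and traces.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def build_tree(rows, labels, features):
    """Grow a tree on presence/absence features; leaves store P(label == 1)."""
    if not labels:
        return 0.0
    if len(set(labels)) == 1 or not features:
        return sum(labels) / len(labels)
    base, best_gain, best_f = entropy(labels), 0.0, None
    for f in features:
        yes = [l for r, l in zip(rows, labels) if f in r]
        no = [l for r, l in zip(rows, labels) if f not in r]
        if not yes or not no:
            continue  # this feature does not split the data
        gain = base - (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
        if gain > best_gain:
            best_gain, best_f = gain, f
    if best_f is None:
        return sum(labels) / len(labels)
    rest = [f for f in features if f != best_f]
    yes_i = [i for i, r in enumerate(rows) if best_f in r]
    no_i = [i for i, r in enumerate(rows) if best_f not in r]
    return (best_f,
            build_tree([rows[i] for i in yes_i], [labels[i] for i in yes_i], rest),
            build_tree([rows[i] for i in no_i], [labels[i] for i in no_i], rest))

def predict(tree, row):
    """Walk the tree and return the leaf probability for one trace."""
    while isinstance(tree, tuple):
        feature, yes, no = tree
        tree = yes if feature in row else no
    return tree

def train(history):
    """history: list of (called_functions, known_faulty_function) pairs."""
    rows = [set(trace) for trace, _ in history]
    features = sorted(set().union(*rows))
    models = {}
    for target in sorted({fault for _, fault in history}):
        # One-against-all labels: 1 if this failed trace was caused by `target`.
        labels = [1 if fault == target else 0 for _, fault in history]
        models[target] = build_tree(rows, labels, features)
    return models

def rank(models, trace):
    """Rank known faulty functions for a new failed trace, most likely first."""
    row = set(trace)
    scores = {f: predict(tree, row) for f, tree in models.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical historical failed traces with their diagnosed faulty function.
history = [
    (["main", "parse", "eval"], "parse"),
    (["main", "parse", "print_out"], "parse"),
    (["main", "lookup", "eval"], "lookup"),
    (["main", "lookup", "print_out"], "lookup"),
    (["main", "eval", "print_out"], "eval"),
]
models = train(history)
ranking = rank(models, ["main", "init", "parse", "eval"])
print(ranking)
```

A developer would inspect functions in the order given by `ranking`. The fraction of code examined before reaching the true faulty function is the kind of effort measure the abstract reports, although the metric's exact definition is not given here.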