Defect Data Analysis Based on Extended Association Rule Mining

Authors:
Shuji Morisaki;Akito Monden;Tomoko Matsumura;Haruaki Tamada;Ken-ichi Matsumoto
Affiliations:
Nara Institute of Science and Technology, Japan;Nara Institute of Science and Technology, Japan;Nara Institute of Science and Technology, Japan;Nara Institute of Science and Technology, Japan;Nara Institute of Science and Technology, Japan
Venue:
MSR '07 Proceedings of the Fourth International Workshop on Mining Software Repositories
Year:
2007

Abstract

This paper describes an empirical study to reveal rules associated with defect correction effort. We defined defect correction effort as a quantitative (ratio scale) variable, and extended conventional (nominal scale based) association rule mining to handle such quantitative variables directly. An extended rule describes the statistical characteristics of a ratio or interval scale variable in the consequent part of the rule by its mean value and standard deviation, so that conditions producing distinctive statistics can be discovered. As an analysis target, we collected various attributes of about 1,200 defects found in a typical medium-scale, multi-vendor (distance development) information system development project in Japan. Our findings based on the extracted rules include: (1) Defects detected in coding/unit testing were easily corrected (less than 7% of the mean effort) when they were related to data output or validation of input data. (2) Nevertheless, they sometimes required much more effort (lift of standard deviation was 5.845) in cases of low reproducibility. (3) Defects introduced in coding/unit testing often required large correction effort (mean was 12.596 staff-hours and standard deviation was 25.716) when they were related to data handling. From these findings, we confirmed that we need to pay attention to types of defects having a large mean effort as well as those having a large standard deviation of effort, since such defects sometimes cause excess effort.
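The kind of rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function `rule_stats`, the field names (`phase`, `type`, `effort`), and the toy records are hypothetical. The sketch also assumes that a "lift" is the ratio of a statistic among the defects matching the rule's antecedent to the same statistic over all defects; the abstract does not give the definition.

```python
from statistics import mean, pstdev


def rule_stats(records, antecedent, target="effort"):
    """Summarise `target` over the records that satisfy `antecedent`.

    `antecedent` maps attribute names to required nominal values.
    The lift definitions below are an assumption made for illustration:
    statistic over matching records divided by statistic over all records.
    """
    overall = [r[target] for r in records]
    matched = [
        r[target]
        for r in records
        if all(r.get(key) == value for key, value in antecedent.items())
    ]
    if not matched:
        raise ValueError("no record satisfies the antecedent")

    overall_std = pstdev(overall)
    rule_mean, rule_std = mean(matched), pstdev(matched)
    return {
        "support": len(matched) / len(overall),
        "mean": rule_mean,
        "std": rule_std,
        "mean_lift": rule_mean / mean(overall),
        # Undefined when every record has the same effort; report None then.
        "std_lift": rule_std / overall_std if overall_std > 0 else None,
    }


# Hypothetical toy data, not taken from the studied project.
defects = [
    {"phase": "coding", "type": "output", "effort": 1.0},
    {"phase": "coding", "type": "output", "effort": 1.0},
    {"phase": "coding", "type": "data", "effort": 3.0},
    {"phase": "design", "type": "data", "effort": 7.0},
]

stats = rule_stats(defects, {"phase": "coding", "type": "output"})
```

Under these assumptions, a rule with a mean lift well below 1 corresponds to defects that were cheap to correct on average, while a standard deviation lift well above 1 flags conditions under which effort was unusually variable.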