Training on errors experiment to detect fault-prone software modules by spam filter

Authors:
Osamu Mizuno;Tohru Kikuno
Affiliations:
Osaka University, Suita, Japan;Osaka University, Suita, Japan
Venue:
Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Year:
2007

Citing 15
Cited 8

Experimentation in software engineering: an introduction
An empirical evaluation of fault-proneness models. Proceedings of the 24th International Conference on Software Engineering
Controlling Overfitting in Classification-Tree Models of Software Quality. Empirical Software Engineering
CCFinder: a multilinguistic token-based code clone detection system for large scale source code. IEEE Transactions on Software Engineering
Assessing the applicability of fault-proneness models across object-oriented software projects. IEEE Transactions on Software Engineering
Predicting Fault-Prone Software Modules in Embedded Systems with Classification Trees. HASE '99 The 4th IEEE International Symposium on High-Assurance Systems Engineering
Software Quality Classification Modeling Using The SPRINT Decision Tree Algorithm. ICTAI '02 Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence
Comparative Assessment of Software Quality Classification Techniques: An Empirical Case Study. Empirical Software Engineering
Spam Filtering using a Markov Random Field Model with Variable Weighting Schemas. ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Combining winnow and orthogonal sparse bigrams for incremental spam filtering. PKDD '04 Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases
Comparing Fault-Proneness Estimation Models. ICECCS '05 Proceedings of the 10th IEEE International Conference on Engineering of Complex Computer Systems
When do changes induce fixes? MSR '05 Proceedings of the 2005 international workshop on Mining software repositories
Analyzing Software Quality with Limited Fault-Proneness Defect Data. HASE '05 Proceedings of the Ninth IEEE International Symposium on High-Assurance Systems Engineering
Data Mining Static Code Attributes to Learn Defect Predictors. IEEE Transactions on Software Engineering
Spam Filter Based Approach for Finding Fault-Prone Software Modules. MSR '07 Proceedings of the Fourth International Workshop on Mining Software Repositories

An extension of fault-prone filtering using precise training and a dynamic threshold. Proceedings of the 2008 international working conference on Mining software repositories
Prediction of Fault-Prone Software Modules Using a Generic Text Discriminator. IEICE - Transactions on Information and Systems
Fault-prone module detection using large-scale text features based on spam filtering. Empirical Software Engineering
An integrated approach to detect fault-prone modules using complexity and text feature metrics. AST/UCMA/ISA/ACN'10 Proceedings of the 2010 international conference on Advances in computer science and information technology
Do comments explain codes adequately?: investigation by text filtering. Proceedings of the 8th Working Conference on Mining Software Repositories
Can faulty modules be predicted by warning messages of static code analyzer? Advances in Software Engineering - Special issue on Software Quality Assurance Methodologies and Techniques
Bug prediction based on fine-grained module histories. Proceedings of the 34th International Conference on Software Engineering
Predicting method crashes with bytecode operations. Proceedings of the 6th India Software Engineering Conference

Abstract

Detecting fault-prone modules in source code is important for assuring software quality. Most previous approaches to fault-prone module detection are based on software metrics. Such approaches, however, make it difficult to collect the metrics and to construct mathematical models from them. To mitigate these difficulties, we propose a novel approach, named Fault-Prone Filtering, that detects fault-prone modules using a spam filtering technique. Driven by the growing need to detect spam e-mail, spam filtering has matured into a convenient and effective text mining technique. In our approach, source code modules are treated as text files and passed directly to the spam filter, which classifies them as fault-prone or not. This paper describes the training on errors procedure for applying fault-prone filtering in practice. Because no pre-training is required, the procedure can be applied to an actual development project immediately. To show the usefulness of our approach, we conducted an experiment using a large source code repository of a Java-based open source project. The experimental results show that our approach classifies about 85% of software modules correctly. The results also indicate that fault-prone modules can be detected at relatively low cost at an early stage.
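
The training on errors idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a small naive Bayes token classifier stands in for the spam filter, and the names `TokenFilter`, `train_on_errors`, the labels "faulty" and "clean", and the tokenizer are all assumptions made for illustration. The sketch starts from an untrained filter, classifies each module as it arrives, and trains only when the prediction turns out to be wrong.

```python
import math
import re
from collections import Counter


class TokenFilter:
    """Tiny naive Bayes text classifier standing in for a spam filter (assumption)."""

    LABELS = ("faulty", "clean")

    def __init__(self):
        self.counts = {label: Counter() for label in self.LABELS}
        self.totals = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(source):
        # Hypothetical tokenizer: identifiers, numbers, or single symbols.
        return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)

    def train(self, source, label):
        tokens = self.tokenize(source)
        self.counts[label].update(tokens)
        self.totals[label] += len(tokens)

    def classify(self, source):
        vocab = len(set(self.counts["faulty"]) | set(self.counts["clean"])) + 1
        scores = {}
        for label in self.LABELS:
            denom = self.totals[label] + vocab
            # Laplace-smoothed log-likelihood of the module's tokens.
            scores[label] = sum(
                math.log((self.counts[label][tok] + 1) / denom)
                for tok in self.tokenize(source)
            )
        # Ties (including the untrained filter) default to "clean".
        return "faulty" if scores["faulty"] > scores["clean"] else "clean"


def train_on_errors(modules):
    """Classify modules in arrival order; train only on misclassified ones."""
    flt = TokenFilter()  # no pre-training
    correct = 0
    for source, actual in modules:
        if flt.classify(source) == actual:
            correct += 1
        else:
            flt.train(source, actual)
    return flt, correct
```

Here `modules` is assumed to be a list of `(source_text, actual_label)` pairs in chronological order, where the actual label becomes known later (for example, from a bug report). Training only on errors keeps the training set small, which is one reason such a procedure can start without any pre-training.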