An investigation of the effect of module size on defect prediction using static measures

Authors:
A. Günes Koru;Hongfang Liu
Affiliations:
University of Maryland, Baltimore County - UMBC, Baltimore, MD;University of Maryland, Baltimore County - UMBC, Baltimore, MD
Venue:
PROMISE '05 Proceedings of the 2005 workshop on Predictor models in software engineering
Year:
2005

Abstract

We used several machine learning algorithms to predict the defective modules in five NASA products: CM1, JM1, KC1, KC2, and PC1. A set of static measures was employed as predictor variables. While doing so, we observed that a large portion of the modules were small, as measured by lines of code (LOC). When we experimented on data subsets created by partitioning the modules according to size, we obtained higher prediction performance for the subsets that included larger modules. We also performed defect prediction for KC1 using class-level data rather than method-level data, and the class-level data yielded better prediction performance. These findings suggest that quality assurance activities can be guided more effectively if defect prediction is performed using data that belong to larger modules.
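
The size-based comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the bin edges (10 and 50 LOC), the choice of F-measure as the performance score, the function names, and the sample records are all assumptions made for illustration, since the abstract names neither the partition boundaries nor the metric. The sketch only shows the general idea of grouping modules by LOC and scoring an existing set of predictions separately within each group.

```python
from collections import defaultdict


def size_bin(loc, edges=(10, 50)):
    """Return a label for the size bin that a module with `loc` lines falls into.

    The default edges are hypothetical, not taken from the paper.
    """
    lower = 0
    for edge in edges:
        if loc < edge:
            return f"{lower}-{edge - 1} LOC"
        lower = edge
    return f"{lower}+ LOC"


def f_measure(pairs):
    """F-measure for the defective class from (actual, predicted) boolean pairs."""
    tp = sum(1 for actual, predicted in pairs if actual and predicted)
    fp = sum(1 for actual, predicted in pairs if not actual and predicted)
    fn = sum(1 for actual, predicted in pairs if actual and not predicted)
    if tp == 0:
        # No true positives: treat the score as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f_measure_by_size(modules, edges=(10, 50)):
    """Group (loc, actual, predicted) records by size bin and score each bin."""
    bins = defaultdict(list)
    for loc, actual, predicted in modules:
        bins[size_bin(loc, edges)].append((actual, predicted))
    return {label: f_measure(pairs) for label, pairs in bins.items()}


# Made-up records for illustration only:
# (LOC, actually defective, predicted defective)
modules = [
    (4, False, False), (7, True, False), (9, False, True),
    (20, True, True), (35, False, False), (48, True, False),
    (120, True, True), (300, True, True), (75, False, True),
]

scores = f_measure_by_size(modules)
for label, score in scores.items():
    print(f"{label}: F-measure = {score:.2f}")
```

In practice the predictions would come from a classifier trained on static measures, and each subset would be evaluated with cross-validation or a held-out set rather than a handful of fixed records.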