When building software quality models, the usual approach is to train a data-mining learner on a single fit dataset, typically containing software metrics collected during a past release of the software project whose quality we want to predict. To improve the predictive accuracy of such quality models, it is common practice to combine the predictions of multiple learners so as to exploit their respective biases. Although multi-learner classifiers have proven successful in some cases, the improvement is not always significant because the information in a single fit dataset can be insufficient. We present an innovative method for building software quality models that uses majority voting to combine the predictions of multiple learners induced on multiple training datasets. To our knowledge, no previous study in software quality has attempted to exploit the multiple software-project data repositories that are generally spread across an organization. In a large-scale empirical study involving seven real-world datasets and seventeen learners, we show that, on average, combining the predictions of one learner trained on multiple datasets significantly improves predictive performance compared to the same learner induced on a single fit dataset. We also demonstrate empirically that combining multiple learners trained on a single training dataset does not significantly improve average predictive accuracy compared to a single learner induced on a single fit dataset.
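The combination scheme described above can be sketched in a few lines: each copy of the learner, trained on a different fit dataset, emits a class prediction per module, and the final label is the majority of those votes. This is a minimal illustrative sketch, not the paper's implementation; the `majority_vote` function and the example predictions are assumptions for demonstration.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-learner class predictions by simple majority.

    predictions: a list of prediction lists, one per learner/dataset,
    all of the same length (one entry per software module).
    """
    # For each module, count the votes across learners and keep the
    # most frequent label.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Hypothetical binary (fault-prone = 1) predictions of one learner
# trained on three different fit datasets, over four modules.
preds = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
print(majority_vote(preds))  # -> [1, 0, 1, 1]
```

With an odd number of voters and binary labels there are no ties; for an even number of voters, a tie-breaking rule (e.g., predicting the majority class of the training data) would have to be chosen.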