An ant colony optimization algorithm to improve software quality prediction models: Case of class stability

  • Authors:
  • D. Azar; J. Vybihal

  • Affiliations:
  • Department of Computer Science and Mathematics, Lebanese American University, Byblos 1h401 2010, Lebanon; McGill University, School of Computer Science, 3480 University St., Montreal, Quebec, Canada H3A 2A7

  • Venue:
  • Information and Software Technology
  • Year:
  • 2011

Abstract

Context: Assessing software quality at the early stages of the design and development process is very difficult because most software quality characteristics are not directly measurable. Nonetheless, they can be derived from other measurable attributes. For this purpose, software quality prediction models have been used extensively. However, building accurate prediction models is hard because data is scarce in the domain of software engineering. As a result, prediction models built on one data set show a significant drop in accuracy when they are used to classify new, unseen data.

Objective: The objective of this paper is to present an approach that optimizes the accuracy of software quality predictive models when they are used to classify new data.

Method: This paper presents an adaptive approach that takes already built predictive models and adapts them (one at a time) to new data. We use an ant colony optimization algorithm in the adaptation process. The approach is validated on the stability of classes in object-oriented software systems and can easily be used for any other software quality characteristic. It can also be easily extended to work with software quality prediction problems involving more than two classification labels.

Results: The results show that our approach outperforms the machine learning algorithm C4.5 as well as random guessing. It also preserves the expressiveness of the models, which provide not only the classification label but also guidelines to attain it.

Conclusion: Our approach is an adaptive one that can be seen as taking predictive models that have already been built from common domain data and adapting them to context-specific data. This is suitable for the domain of software quality, since the data is very scarce and hence predictive models built from one data set are hard to generalize and reuse on new data.
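The abstract does not describe how the models are represented or how the ants modify them, so the following Python sketch is only an illustration of the general idea of adapting an existing model to new data with ant colony optimization. It is not the authors' algorithm. Everything in it is an assumption made for the example: the model is taken to be a list of metric thresholds, each threshold may be replaced by one of a few candidate values, and the names `classify`, `accuracy`, `aco_adapt`, the parameters `n_ants`, `n_iters`, `rho`, and the sample data are all hypothetical.

```python
import random

def classify(thresholds, sample):
    """Hypothetical rule model: label 1 ("stable") if every metric is
    at or below its threshold, otherwise label 0 ("unstable")."""
    return int(all(x <= t for x, t in zip(sample, thresholds)))

def accuracy(thresholds, data):
    """Fraction of (sample, label) pairs the model classifies correctly."""
    return sum(classify(thresholds, x) == y for x, y in data) / len(data)

def aco_adapt(candidates, data, n_ants=10, n_iters=30, rho=0.1, seed=0):
    """Adapt a threshold model to `data`.

    candidates[i] lists the allowed values for threshold i; by assumption
    the first value is the one used by the already built model.
    """
    rng = random.Random(seed)
    # One pheromone value per (threshold, candidate value) pair.
    pheromone = [[1.0] * len(c) for c in candidates]
    best = [c[0] for c in candidates]          # start from the original model
    best_acc = accuracy(best, data)
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant picks one candidate per threshold, biased by pheromone.
            choice = [rng.choices(range(len(c)), weights=p)[0]
                      for c, p in zip(candidates, pheromone)]
            thresholds = [c[j] for c, j in zip(candidates, choice)]
            acc = accuracy(thresholds, data)
            if acc > best_acc:
                best, best_acc = thresholds, acc
            # Deposit pheromone in proportion to the ant's accuracy.
            for i, j in enumerate(choice):
                pheromone[i][j] += acc
        # Evaporation keeps early choices from dominating forever.
        pheromone = [[(1 - rho) * p for p in row] for row in pheromone]
    return best, best_acc

# Made-up "new" data: (metric values, label), label 1 meaning stable.
new_data = [((3, 10), 1), ((8, 12), 1), ((15, 40), 0), ((9, 35), 0)]
# First value in each list is the original model's threshold.
candidates = [[5, 8, 12], [20, 30]]

original = [c[0] for c in candidates]
original_acc = accuracy(original, new_data)
adapted, adapted_acc = aco_adapt(candidates, new_data)
print(original, original_acc)
print(adapted, adapted_acc)
```

Because the search starts from the original thresholds and only replaces them when accuracy on the new data improves, the adapted model in this sketch can never score lower on that data than the model it started from. The adapted model also stays a readable set of thresholds, which is one way to picture the abstract's point about preserving expressiveness.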