Empirical Assessment of a Software Metric: The Information Content of Operators

  • Authors:
  • Taghi M. Khoshgoftaar; Edward B. Allen

  • Affiliations:
  • Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431, taghi@cse.fau.edu; Mississippi State University, Mississippi, edward.allen@computer.org

  • Venue:
  • Software Quality Control
  • Year:
  • 2001

Abstract

This paper presents an empirical case study that predicted faults in modules based on the total information content of the operators. This metric is closely related to Harrison's average information content classification (AICC), which is the entropy of the operators. Most information theory-based metrics proposed in the literature have not been subjected to empirical predictive studies of real-world software systems. In contrast, this study shows that, in the context of a commercial software development organization, a simple information theory-based metric can be more useful for predicting software quality than comparable metrics based on counts.

Three models were considered, all based on operators as an abstraction of software. The model based on the information content of the operators made more accurate predictions than two similar models based on the number of operators and the number of unique operators. The purpose of this paper is a fair comparison of the three metrics, rather than the development of an optimal model; we have long advocated multivariate models for industrial use. The case study considered three large commercial systems, written in assembly language and developed consecutively by professional programmers. The first system was used to estimate the parameters of the models. The subsequent two were used to evaluate the accuracy of the model predictions.
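To make the three operator-based metrics concrete, the following is a minimal sketch in Python. It is not the authors' tooling. It assumes that the entropy of the operators is the Shannon entropy (in bits) of the operator frequency distribution within a module, and that the total information content is the total number of operators multiplied by that entropy; the abstract does not state either formula. The function name `operator_metrics` and the sample operator list are hypothetical.

```python
import math
from collections import Counter


def operator_metrics(operators):
    """Compute three operator-based metrics for one module.

    `operators` is the sequence of operator tokens found in the module
    (for assembly code, for example, the instruction mnemonics).
    Assumption: total information content = total operators * entropy.
    """
    counts = Counter(operators)
    n_total = len(operators)   # number of operators
    n_unique = len(counts)     # number of unique operators

    if n_total == 0:
        entropy = 0.0
    else:
        # Shannon entropy of the operator frequency distribution, in bits.
        entropy = -sum(
            (c / n_total) * math.log2(c / n_total) for c in counts.values()
        )

    return {
        "total_operators": n_total,
        "unique_operators": n_unique,
        "entropy_bits": entropy,
        "total_information_bits": n_total * entropy,
    }


# Hypothetical module with four operators, three of them distinct.
metrics = operator_metrics(["MOV", "MOV", "ADD", "JMP"])
print(metrics)
```

Under these assumptions, two modules with the same number of operators and the same number of unique operators can still differ in total information content when their operator frequencies are distributed differently, which is the extra signal a count-based metric cannot carry.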