Modeling the relationship between software effort and size using Deming regression

  • Authors:
  • Nikolaos Mittas;Makrina Viola Kosti;Vasiliki Argyropoulou;Lefteris Angelis

  • Affiliations:
  • Aristotle University, Thessaloniki, Greece;Aristotle University, Thessaloniki, Greece;Aristotle University, Thessaloniki, Greece;Aristotle University, Thessaloniki, Greece

  • Venue:
  • Proceedings of the 6th International Conference on Predictive Models in Software Engineering
  • Year:
  • 2010

Abstract

Background: The relationship between software effort and size has been modeled in the literature as exponential, in the sense that the natural logarithm of effort is expressed as a linear function of the logarithm of size. The common approach to estimating the parameters of the linear model is ordinary least squares regression, which has been extensively applied to various datasets. Least squares estimation takes into account only the error arising from the dependent variable (effort), while the measurement of the independent variable (size) is considered free of error.

Aims: The basis of the study is that in practice the assumption of measuring size without error rarely holds, since the measured size of a software project depends on the precision of the measurement tool and often on the subjectivity of the rater. Moreover, the sizes of the projects comprising a dataset have been measured by different measurement tools, which adds another source of variability in the independent variable.

Method: In this paper, we consider a regression technique, known as Deming regression, which takes into account the error in the measurement of the independent variable, size. Deming regression is applied to four publicly available datasets in order to model the linear relationship between effort and size and to compare it with ordinary least squares.

Results: Accuracy measures of fitting (MAE, MdAE, MMRE, MdMRE, pred25) are improved by Deming regression. Comparison of Absolute Errors (AE) by the Wilcoxon test shows significant difference at

Conclusions: Deming regression is appropriate for datasets where the size is subject to measurement error. However, some assumptions on the variances of the measurement errors are arbitrary and need to be studied. Further work is needed on using Deming regression for effort prediction.
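The log-linear model and the two fitting methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`deming_fit`, `ols_fit`, `accuracy`), the toy size and effort values, and the choice `delta = 1.0` for the ratio of error variances (effort error variance over size error variance) are all assumptions made for illustration. The slope uses the standard closed-form Deming estimator, and pred25 is taken here as the share of projects whose relative error is at most 0.25.

```python
import math
from statistics import mean, median


def _moments(x, y):
    """Return means and sample (co)variances of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    return x_bar, y_bar, s_xx, s_yy, s_xy


def ols_fit(x, y):
    """Ordinary least squares: all error is attributed to y."""
    x_bar, y_bar, s_xx, _, s_xy = _moments(x, y)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def deming_fit(x, y, delta=1.0):
    """Deming regression for y = b0 + b1 * x with errors in both variables.

    delta is the ASSUMED ratio var(error in y) / var(error in x);
    delta = 1.0 corresponds to orthogonal regression.
    """
    x_bar, y_bar, s_xx, s_yy, s_xy = _moments(x, y)
    if s_xy == 0:
        raise ValueError("slope is undefined when the sample covariance is zero")
    d = s_yy - delta * s_xx
    b1 = (d + math.sqrt(d * d + 4.0 * delta * s_xy ** 2)) / (2.0 * s_xy)
    return y_bar - b1 * x_bar, b1


def accuracy(actual, predicted):
    """Fitting accuracy measures computed from absolute and relative errors."""
    ae = [abs(a - p) for a, p in zip(actual, predicted)]
    mre = [e / a for e, a in zip(ae, actual)]
    return {
        "MAE": mean(ae),
        "MdAE": median(ae),
        "MMRE": mean(mre),
        "MdMRE": median(mre),
        "pred25": sum(m <= 0.25 for m in mre) / len(mre),
    }


# Hypothetical toy data, NOT taken from the paper's datasets.
size = [100.0, 250.0, 400.0, 800.0, 1500.0]
effort = [520.0, 1100.0, 2300.0, 3900.0, 8200.0]

# Fit ln(effort) = b0 + b1 * ln(size) with both methods.
ln_size = [math.log(s) for s in size]
ln_effort = [math.log(e) for e in effort]

for name, fit in (("OLS", ols_fit), ("Deming", deming_fit)):
    b0, b1 = fit(ln_size, ln_effort)
    fitted = [math.exp(b0 + b1 * ls) for ls in ln_size]  # back to effort scale
    print(name, round(b0, 3), round(b1, 3), accuracy(effort, fitted))
```

Because `delta` has to be chosen in advance, re-running the fit with several values is one simple way to see how sensitive the slope is to that assumption, which is the concern raised in the conclusions.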