Local bias and its impacts on the performance of parametric estimation models

  • Authors:
  • Ye Yang; Lang Xie; Zhimin He; Qi Li; Vu Nguyen; Barry Boehm; Ricardo Valerdi

  • Affiliations:
  • Lab for Internet Software Technology, Institute of Software, Chinese Academy of Sciences; Lab for Internet Software Technology, Institute of Software, Chinese Academy of Sciences and Graduate University of Chinese Academy of Sciences, Beijing, China; Lab for Internet Software Technology, Institute of Software, Chinese Academy of Sciences and Graduate University of Chinese Academy of Sciences, Beijing, China; University of Southern California, Los Angeles; University of Southern California, Los Angeles; University of Southern California, Los Angeles; Massachusetts Institute of Technology, Cambridge

  • Venue:
  • Proceedings of the 7th International Conference on Predictive Models in Software Engineering
  • Year:
  • 2011

Abstract

Background: Continuously calibrated and validated parametric models are necessary for realistic software estimates. In practice, however, variations in model adoption and usage patterns introduce a great deal of local bias into the resulting historical data. Such local bias should be carefully examined and addressed before the historical data are used to calibrate new versions of parametric models. Aims: This study investigates the degree of such local bias in a cross-company historical dataset and assesses its impact on the performance of a parametric estimation model. Method: Our study consists of three parts: 1) defining a method for measuring and analyzing the local bias associated with each organization's data subset within the overall dataset; 2) assessing the impact of local bias on the estimation performance of the COCOMO II 2000 model; 3) performing a correlation analysis to verify that local bias can be harmful to the performance of a parametric estimation model. Results: Our results show that local bias negatively impacts the performance of the parametric model. Our measure of local bias is positively correlated with performance, and the correlation is statistically significant. Conclusion: Local calibration that uses the entire multi-company dataset yields worse performance. The influence of multi-company data can be characterized by local bias and measured with our method.
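
For readers unfamiliar with the model named in the abstract, the following is a minimal sketch of the published COCOMO II.2000 effort equation, PM = A × Size^E × ∏EM with E = B + 0.01 × ΣSF, together with one common way to locally calibrate the constant A in log space. The function names, the tuple layout of `projects`, and the log-space calibration step are illustrative assumptions; the abstract does not describe how the authors performed calibration or measured local bias, so this should not be read as their method.

```python
import math

# Published COCOMO II.2000 calibration constants.
A_2000 = 2.94
B_2000 = 0.91

# Nominal ratings of the five scale factors (PREC, FLEX, RESL, TEAM, PMAT).
NOMINAL_SF = [3.72, 3.04, 4.24, 3.29, 4.68]


def cocomo_effort(ksloc, scale_factors, effort_multipliers, a=A_2000, b=B_2000):
    """Estimated effort in person-months for a project of `ksloc` KSLOC."""
    exponent = b + 0.01 * sum(scale_factors)
    # math.prod of an empty list is 1, i.e. all cost drivers nominal.
    return a * ksloc ** exponent * math.prod(effort_multipliers)


def calibrate_a(projects, b=B_2000):
    """Hypothetical local calibration of A (assumption, not the paper's method).

    `projects` is an iterable of
    (ksloc, scale_factors, effort_multipliers, actual_person_months).
    A is chosen as the geometric mean of actual effort over the
    estimate computed with A = 1.
    """
    log_ratios = []
    for ksloc, sfs, ems, actual in projects:
        unscaled = cocomo_effort(ksloc, sfs, ems, a=1.0, b=b)
        log_ratios.append(math.log(actual) - math.log(unscaled))
    return math.exp(sum(log_ratios) / len(log_ratios))
```

With all scale factors and cost drivers at nominal, `cocomo_effort(100, NOMINAL_SF, [])` gives roughly 465 person-months for a 100 KSLOC project. Calibrating A on a pooled multi-company dataset versus a single organization's subset is one simple way to observe the kind of divergence the abstract attributes to local bias.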