Using chronological splitting to compare cross- and single-company effort models: further investigation

  • Authors:
  • Chris Lokan; Emilia Mendes

  • Affiliations:
  • Canberra ACT, Australia; University of Auckland, Auckland, New Zealand

  • Venue:
  • ACSC '09 Proceedings of the Thirty-Second Australasian Conference on Computer Science - Volume 91
  • Year:
  • 2009

Abstract

Numerous studies have used historical datasets to build and validate models for estimating software development effort. Very few used a chronological split (where projects' end dates are used so that training sets only contain projects that were completed before the start date of each project in the validation set), and only one compared chronological split to random split. Therefore the aim of this study is to investigate further and compare the use of chronological and random splitting. We do so in the context of comparing cross-company and single-company models for effort estimation. We used 450 single-company projects and 741 cross-company projects from the ISBSG Release 10 repository, and estimates were obtained using manual stepwise regression. We found that with these data the use of chronological splitting, and different splitting dates, did not affect prediction accuracy. We were not able to obtain a converging set of findings when comparing cross- to single-company predictions given that different accuracy measures presented contradictory results.
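
The difference between the two splitting strategies described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' procedure. The `Project` record, the function names `chronological_split` and `random_split`, the sample dates, and the single fixed splitting date are all hypothetical and chosen only for illustration. The sketch assumes a split date is picked first, that training projects must have finished before it, and that validation projects must have started on or after it.

```python
from dataclasses import dataclass
from datetime import date
import random


@dataclass(frozen=True)
class Project:
    # Hypothetical record; a real dataset would also carry size, effort, etc.
    name: str
    start: date
    end: date


def chronological_split(projects, split_date):
    """Train on projects completed before split_date; validate on projects
    that started on or after it. Projects spanning the date are left out."""
    train = [p for p in projects if p.end < split_date]
    valid = [p for p in projects if p.start >= split_date]
    return train, valid


def random_split(projects, train_fraction, seed=0):
    """Shuffle the projects and cut the list, ignoring dates entirely."""
    shuffled = list(projects)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up example data, not taken from the ISBSG repository.
projects = [
    Project("A", date(2001, 1, 1), date(2001, 6, 30)),
    Project("B", date(2001, 3, 1), date(2002, 2, 28)),
    Project("C", date(2002, 1, 15), date(2002, 9, 1)),
    Project("D", date(2003, 2, 1), date(2003, 8, 1)),
]

chrono_train, chrono_valid = chronological_split(projects, date(2002, 1, 1))
rand_train, rand_valid = random_split(projects, train_fraction=0.5)
```

With a chronological split, every training project ends before every validation project starts, so the model never sees data that would not have been available at estimation time. A random split gives no such guarantee. Repeating the chronological split with several values of `split_date` would mirror the idea of comparing different splitting dates.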