Using genetic programming to improve software effort estimation based on general data sets

  • Authors:
  • Martin Lefley;Martin J. Shepperd

  • Affiliations:
  • School of Design Engineering and Computing, University of Bournemouth, Poole, UK;School of Design Engineering and Computing, University of Bournemouth, Poole, UK

  • Venue:
  • GECCO'03 Proceedings of the 2003 international conference on Genetic and evolutionary computation: PartII
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper investigates the use of various techniques including genetic programming, with public data sets, to attempt to model and hence estimate software project effort. The main research question is whether genetic programs can offer 'better' solution search using public domain metrics rather than company specific ones. Unlike most previous research, a realistic approach is taken, whereby predictions are made on the basis of the data available at a given date. Experiments are reported, designed to assess the accuracy of estimates made using data within and beyond a specific company. This research also offers insights into genetic programming's performance, relative to alternative methods, as a problem solver in this domain. The results do not find a clear winner but, for this data, GP performs consistently well, but is harder to configure and produces more complex models. The evidence here agrees with other researchers that companies would do well to base estimates on in house data rather than incorporating public data sets. The complexity of the GP must be weighed against the small increases in accuracy to decide whether to use it as part of any effort prediction estimation.