Automated trendline generation for accurate software effort estimation

  • Authors:
  • Karthikeyan Ponnalagu;Nanjangud Narendra

  • Affiliations:
  • IBM Research India, Bangalore, India;IBM Research India, Bangalore, India

  • Venue:
  • Proceedings of the 3rd annual conference on Systems, programming, and applications: software for humanity
  • Year:
  • 2012

Abstract

It is well known that accurate effort estimation is one of the key factors in deciding the success of a software project. However, as any project manager knows, generating accurate estimates has proven to be extremely difficult in practice. Even well-known estimation techniques such as COCOMO or SLIM are not guaranteed to work all the time. One key issue in estimation is the selection of the appropriate historical project data set as a frame of reference against which the estimate can be generated. In our experience working with software projects at IBM, we have found this to be the most crucial deciding factor for the success of a software estimate; indeed, choosing the wrong project data set during estimation could be disastrous for the software project in question. This is because the trendlines (charts of effort vis-a-vis size) generated from the historical data determine the estimate for the software project, and wrong trendlines could result in wrong estimates.

To that end, in this paper, we present an automated trendline generation technique for improving effort estimation in software projects. Our technique makes use of a novel data structure that we have designed, called the Estimation Key-Map, which represents project data in a multi-dimensional format. This format enables dynamic analysis and clustering of project data into appropriate subsets that can be selected as historical data for estimating the software project in question. We validate our technique against reported actual data by evaluating it on a large project data set from IBM; therein, we show how our technique enables the selection of the appropriate trendline, thereby enabling more accurate effort estimates.
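To illustrate why the choice of historical subset matters, the following is a minimal sketch, not the authors' Estimation Key-Map, whose internals the abstract does not describe. It assumes a plain dictionary keyed by project attributes stands in for the multi-dimensional grouping, and that a trendline is a power-law fit of effort against size (effort = a * size^b), which is a common convention but an assumption here. The attribute name `domain`, the function names, and all data values are hypothetical.

```python
import math
from collections import defaultdict


def fit_trendline(points):
    """Least-squares fit of effort = a * size**b, done in log-log space.

    `points` is a list of (size, effort) pairs with positive values.
    The power-law form is an assumption, not taken from the paper.
    """
    xs = [math.log(size) for size, _ in points]
    ys = [math.log(effort) for _, effort in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def group_by_key(projects, dims):
    """Group (size, effort) pairs by the tuple of values for `dims`.

    A simple stand-in for a multi-dimensional project index.
    """
    groups = defaultdict(list)
    for project in projects:
        key = tuple(project[d] for d in dims)
        groups[key].append((project["size"], project["effort"]))
    return groups


def estimate_effort(projects, dims, new_project):
    """Estimate effort from the trendline of the matching subset only."""
    groups = group_by_key(projects, dims)
    key = tuple(new_project[d] for d in dims)
    points = groups.get(key, [])
    if len({size for size, _ in points}) < 2:
        raise ValueError(f"not enough historical data for key {key}")
    a, b = fit_trendline(points)
    return a * new_project["size"] ** b


# Made-up historical data: two domains with very different effort per unit size.
history = [
    {"domain": "web", "size": 10, "effort": 20},
    {"domain": "web", "size": 20, "effort": 40},
    {"domain": "web", "size": 40, "effort": 80},
    {"domain": "embedded", "size": 10, "effort": 50},
    {"domain": "embedded", "size": 20, "effort": 100},
]

web_estimate = estimate_effort(history, ["domain"], {"domain": "web", "size": 30})
embedded_estimate = estimate_effort(
    history, ["domain"], {"domain": "embedded", "size": 30}
)
```

With this made-up data, the same project size yields a much larger estimate from the `embedded` trendline than from the `web` one, which is the kind of discrepancy the abstract attributes to choosing the wrong historical data set.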