AREION: Software effort estimation based on multiple regressions with adaptive recursive data partitioning

Authors:
Yeong-Seok Seo;Doo-Hwan Bae;Ross Jeffery
Venue:
Information and Software Technology
Year:
2013

Abstract

Context: Along with expert judgment, analogy-based estimation, and algorithmic methods (such as Function Point Analysis and COCOMO), Least Squares Regression (LSR) has been one of the most commonly studied software effort estimation methods. However, an effort estimation model built from a single LSR model is highly sensitive to the data distribution. Specifically, if the data set is scattered and the data points do not lie close to the single LSR line (that is, they do not closely follow a linear structure), the model usually performs poorly. Data partitioning is one way to alleviate this effect of the data distribution. Clustering-based partitioning approaches have been introduced, but they can still fail to provide accurate and stable effort estimates.

Objective: In this paper, we propose a new data partitioning-based approach that achieves more accurate and stable effort estimates via LSR. The approach also provides an effort prediction interval, which is useful for describing the uncertainty of the estimates.

Method: Empirical experiments evaluate the proposed approach against the basic LSR approach and clustering-based approaches on industrial data sets: two subsets of the ISBSG (Release 9) data set and one data set collected from a banking institution.

Results: The experimental results show that the proposed approach not only improves the accuracy of effort estimation significantly more than the other approaches, but also achieves robust and stable results across different degrees of data partitioning.

Conclusion: Compared with the other approaches considered, the proposed approach shows superior performance by alleviating the effect of data distribution, which is a major practical issue in software effort estimation.
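
The general idea behind the Context statement, that a single LSR line fits poorly when the data do not follow one linear structure and that per-partition models can help, can be illustrated with a small sketch. This is not the AREION algorithm: the data are made up, the fixed size threshold is a hypothetical stand-in for a real partitioning procedure, and the mean relative error used here is an assumed accuracy measure, not one named in the abstract.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def mean_relative_error(model, xs, ys):
    """Average of |actual - predicted| / actual (assumed accuracy measure)."""
    a, b = model
    return mean(abs(y - (a + b * x)) / y for x, y in zip(xs, ys))

# Hypothetical projects: size vs. effort, with two different linear regimes.
sizes = [100, 200, 300, 400, 500, 600, 700, 800]
effort = [500, 1000, 1500, 2000, 4000, 6000, 8000, 10000]

# One LSR model fitted to all projects.
single_err = mean_relative_error(fit_line(sizes, effort), sizes, effort)

# Two LSR models, one per partition. The threshold of 400 is hypothetical;
# a real approach would derive the partitions from the data.
THRESHOLD = 400
small = [(s, e) for s, e in zip(sizes, effort) if s <= THRESHOLD]
large = [(s, e) for s, e in zip(sizes, effort) if s > THRESHOLD]

errors = []
for part in (small, large):
    xs, ys = zip(*part)
    a, b = fit_line(xs, ys)
    # Per-project relative error under this partition's own model.
    errors.extend(abs(y - (a + b * x)) / y for x, y in part)
partitioned_err = mean(errors)

print(f"single model error:      {single_err:.3f}")
print(f"partitioned model error: {partitioned_err:.3f}")
```

Because each made-up partition is exactly linear, the per-partition models fit their own data far better than the single line does. Real project data are noisy, so the gain depends heavily on how the partitions are chosen, which is the problem the paper addresses.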