Investigating the use of Support Vector Regression for web effort estimation

  • Authors:
  • Anna Corazza; Sergio Di Martino; Filomena Ferrucci; Carmine Gravino; Emilia Mendes

  • Affiliations:
  • University of Napoli "Federico II", Naples, Italy 80126 (Corazza, Di Martino); University of Salerno, Fisciano, Italy 84084 (Ferrucci, Gravino); The University of Auckland, Auckland, New Zealand 92019 (Mendes)

  • Venue:
  • Empirical Software Engineering
  • Year:
  • 2011


Abstract

Support Vector Regression (SVR) is a new generation of Machine Learning algorithms suited to predictive data modeling problems. The objective of this paper is twofold: first, to investigate the effectiveness of SVR for Web effort estimation using a cross-company dataset; second, to compare different SVR configurations in order to identify the one with the best performance. In particular, we took into account three variable preprocessing strategies (no preprocessing, normalization, and logarithmic), in combination with two different dependent variables (effort and inverse effort). As a result, SVR was applied using six different data configurations. Moreover, to understand the suitability of kernel functions for handling non-linear problems, SVR was applied without a kernel and in combination with the Radial Basis Function (RBF) and Polynomial kernels, thus obtaining 18 different SVR configurations. To identify, for each configuration, the best values for each of the parameters, we defined a procedure based on a leave-one-out cross-validation approach. The dataset employed was the Tukutuku database, which has been adopted in many previous Web effort estimation studies. Three different training and test set splits were used, including 130 and 65 projects respectively. The SVR-based predictions were also benchmarked against predictions obtained using Manual StepWise Regression and Case-Based Reasoning. Our results showed that the configuration combining logarithmic feature preprocessing with the RBF kernel provided the best results for all three data splits. In addition, SVR provided significantly more accurate predictions than all the considered benchmarking techniques.
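The winning configuration described above (logarithmic preprocessing of the variables, an RBF kernel, and parameter values chosen by leave-one-out cross-validation) can be sketched as follows. This is not the authors' code: the dataset is synthetic stand-in data, and the parameter grid and feature names are illustrative assumptions; the actual study tuned parameters with its own search procedure over the Tukutuku data.

```python
# Illustrative sketch (not the paper's implementation): RBF-kernel SVR with
# log-transformed features and dependent variable, tuned by leave-one-out
# cross-validation. Data and parameter grid are hypothetical.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, LeaveOneOut

rng = np.random.default_rng(0)
X = rng.uniform(1, 100, size=(40, 3))  # stand-in size drivers (e.g. web pages, images)
# Synthetic effort with multiplicative noise, so values stay positive
effort = 2.0 * X[:, 0] * np.sqrt(X[:, 1]) * np.exp(rng.normal(0, 0.1, 40))

# Logarithmic preprocessing of both features and dependent variable
X_log = np.log(X)
y_log = np.log(effort)

# Leave-one-out search over a small, assumed parameter grid
grid = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid={"C": [1, 10, 100], "epsilon": [0.01, 0.1], "gamma": ["scale", 0.1]},
    cv=LeaveOneOut(),
    scoring="neg_mean_absolute_error",
)
grid.fit(X_log, y_log)

# Predictions are made in log space, then mapped back to the effort scale
pred_effort = np.exp(grid.predict(X_log))
print(grid.best_params_)
```

In a study setting, the fitted model would of course be evaluated on a held-out test split (as the paper does with its 130/65 project splits), not on the training data as in this minimal sketch.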