Using Simulation to Evaluate Prediction Techniques

Authors:
Martin Shepperd; Gada Kadoda
Venue:
METRICS '01 Proceedings of the 7th International Symposium on Software Metrics
Year:
2001

Citing 0
Cited 24

A comparison of case-based reasoning approaches
Proceedings of the 11th international conference on World Wide Web

Issues on the Effective Use of CBR Technology for Software Project Prediction
ICCBR '01 Proceedings of the 4th International Conference on Case-Based Reasoning: Case-Based Reasoning Research and Development

Web hypermedia cost estimation: further assessment and comparison of cost estimation modelling techniques
The New Review of Hypermedia and Multimedia - Hypermedia and the world wide web

Combining techniques to optimize effort predictions in software project management
Journal of Systems and Software

A Simulation Study of the Model Evaluation Criterion MMRE
IEEE Transactions on Software Engineering

Effort estimation: how valuable is it for a web company to use a cross-company data set, compared to using its own single-company data set?
Proceedings of the 16th international conference on World Wide Web

Cross-company vs. single-company web effort models using the Tukutuku database: An extended study
Journal of Systems and Software

Confidence in software cost estimation results based on MMRE and PRED
Proceedings of the 4th international workshop on Predictor models in software engineering

Comparative studies of the model evaluation criterions mmre and pred in software cost estimation research
Proceedings of the Second ACM-IEEE international symposium on Empirical software engineering and measurement

A constrained regression technique for cocomo calibration
Proceedings of the Second ACM-IEEE international symposium on Empirical software engineering and measurement

Web Cost Estimation and Productivity Benchmarking
Software Engineering

What's up with software metrics? - A preliminary mapping study
Journal of Systems and Software

Applying support vector regression for web effort estimation using a cross-company dataset
ESEM '09 Proceedings of the 2009 3rd International Symposium on Empirical Software Engineering and Measurement

Using Support Vector Regression for Web Development Effort Estimation
IWSM '09 /Mensura '09 Proceedings of the International Conferences on Software Process and Product Measurement

Using Tabu Search to Estimate Software Development Effort
IWSM '09 /Mensura '09 Proceedings of the International Conferences on Software Process and Product Measurement

A replicated study comparing web effort estimation techniques
WISE'07 Proceedings of the 8th international conference on Web information systems engineering

Using process simulation to assess the test design effort reduction of a model-based testing approach
ICSP'08 Proceedings of the Software process, 2008 international conference on Making globally distributed software development a success story

Investigating the use of Support Vector Regression for web effort estimation
Empirical Software Engineering

Measures and techniques for effort estimation of web applications: an empirical study based on a single-company dataset
Journal of Web Engineering

Investigating effort prediction of web-based applications using CBR on the ISBSG dataset
EASE'10 Proceedings of the 14th international conference on Evaluation and Assessment in Software Engineering

Web effort estimation: the value of cross-company data set compared to single-company data set
Proceedings of the 8th International Conference on Predictive Models in Software Engineering

A model-driven measurement procedure for sizing web applications: design, automation and validation
MODELS'07 Proceedings of the 10th international conference on Model Driven Engineering Languages and Systems

Using CBR and CART to predict maintainability of relational database-driven software applications
Proceedings of the 17th International Conference on Evaluation and Assessment in Software Engineering

Software development cost estimation using similarity difference between software attributes
Proceedings of the 2013 International Conference on Information Systems and Design of Communication

Abstract

The need for accurate software prediction systems increases as software becomes much larger and more complex. A variety of techniques have been proposed; however, none has proved consistently accurate, and there is still much uncertainty as to which technique suits which type of prediction problem. We believe that the underlying characteristics of the dataset (size, number of features, type of distribution, etc.) influence the choice of prediction system. In previous work, it has proved difficult to obtain significant results over small datasets. Consequently, we required large validation datasets; moreover, we wished to control the characteristics of such datasets in order to systematically explore the relationship between accuracy, choice of prediction system and dataset characteristics. Our solution has been to simulate data, allowing both control and the possibility of large validation sets (1000 cases). In this paper we compared regression, rule induction and nearest neighbour (a form of case-based reasoning). The results suggest that there are significant differences depending upon the characteristics of the dataset. Consequently, researchers should consider prediction context when evaluating competing prediction systems. We also observed that the more "messy" the data and the more complex the relationship with the dependent variable, the more variability in the results. This became apparent because we sampled two different training sets from each simulated population of data. In the more complex cases we observed significantly different results depending upon the training set. This suggests that researchers will need to exercise caution when comparing different approaches and utilise procedures such as bootstrapping in order to generate multiple samples for training purposes.
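
The general idea in the abstract (simulate a population, hold out a large validation set, and bootstrap the training data to see how much the results vary) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear data generator, the single feature, the noise level, the use of mean absolute error as the accuracy measure, the sample sizes, and all function names. Rule induction is omitted for brevity.

```python
import random
import statistics


def simulate_population(n, noise_sd, rng):
    """Hypothetical generator (an assumption): y = 3 + 2x + Gaussian noise."""
    cases = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        y = 3.0 + 2.0 * x + rng.gauss(0.0, noise_sd)
        cases.append((x, y))
    return cases


def fit_linear(train):
    """Ordinary least squares with one feature; returns a predict function."""
    mx = statistics.mean(x for x, _ in train)
    my = statistics.mean(y for _, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    if sxx == 0:  # degenerate sample: fall back to predicting the mean
        return lambda x: my
    slope = sum((x - mx) * (y - my) for x, y in train) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_knn(train, k=3):
    """Nearest neighbour: average the targets of the k closest training cases."""
    def predict(x):
        nearest = sorted(train, key=lambda case: abs(case[0] - x))[:k]
        return statistics.mean(y for _, y in nearest)
    return predict


def mean_abs_error(predict, cases):
    """Accuracy measure chosen for this sketch (an assumption)."""
    return statistics.mean(abs(predict(x) - y) for x, y in cases)


def run_experiment(seed=1, pool_size=40, n_validation=1000, n_boot=30):
    """Return {technique: (mean error, std dev of error)} over bootstrap samples."""
    rng = random.Random(seed)
    population = simulate_population(pool_size + n_validation, 2.0, rng)
    validation = population[:n_validation]
    pool = population[n_validation:]
    # Draw the bootstrap training samples once so both techniques see the same data.
    samples = [[rng.choice(pool) for _ in pool] for _ in range(n_boot)]
    results = {}
    for name, fit in (("regression", fit_linear), ("nearest_neighbour", fit_knn)):
        errors = [mean_abs_error(fit(sample), validation) for sample in samples]
        results[name] = (statistics.mean(errors), statistics.stdev(errors))
    return results


if __name__ == "__main__":
    for name, (mean_err, sd_err) in run_experiment().items():
        print(f"{name}: mean error {mean_err:.2f}, std dev {sd_err:.2f}")
```

The standard deviation across bootstrap samples gives a rough indication of how sensitive each technique is to the particular training set drawn, which is the source of variability the abstract warns about. Changing the generator (for example, adding outliers or a non-linear relationship) is one way to explore how dataset characteristics affect the comparison.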