A comparison of case-based reasoning approaches
Proceedings of the 11th international conference on World Wide Web
The need for accurate software prediction systems increases as software becomes larger and more complex. A variety of techniques have been proposed; however, none has proved consistently accurate, and there is still much uncertainty as to which technique suits which type of prediction problem. We believe that the underlying characteristics of the dataset - its size, number of features, type of distribution, and so on - influence the choice of prediction system. In previous work it has proved difficult to obtain significant results over small datasets, so we required large validation datasets; moreover, we wished to control the characteristics of those datasets in order to systematically explore the relationship between accuracy, choice of prediction system and dataset characteristics. Our solution has been to simulate data, allowing both control and large (1,000 case) validation sets. In this paper we compare regression, rule induction and nearest neighbour (a form of case-based reasoning). The results suggest that there are significant differences in accuracy depending upon the characteristics of the dataset. Consequently, researchers should consider the prediction context when evaluating competing prediction systems. We also observed that the "messier" the data, and the more complex its relationship with the dependent variable, the greater the variability in the results. This became apparent because we sampled two different training sets from each simulated population of data: in the more complex cases we observed significantly different results depending upon the training set used. This suggests that researchers will need to exercise caution when comparing different approaches, and should use procedures such as bootstrapping to generate multiple samples for training purposes.
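The experimental design described above - simulating a population, drawing more than one training sample from it, and scoring competing predictors on a large held-out validation set - can be sketched in a few lines. This is an illustrative toy only, not the authors' actual simulation: the linear effort-versus-size model, the noise level, the sample sizes and the use of MMRE as the accuracy criterion are all assumptions made for the example.

```python
import random

def simulate(n, noise=0.3, seed=0):
    # Toy "population": effort grows linearly with size, plus
    # size-proportional noise (an assumed data-generating process).
    rng = random.Random(seed)
    return [(s, 5 + 2 * s + rng.gauss(0, noise * s))
            for s in (rng.uniform(1, 100) for _ in range(n))]

def fit_ols(train):
    # Simple least-squares regression: effort = a + b * size.
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    b = (sum((x - mx) * (y - my) for x, y in train)
         / sum((x - mx) ** 2 for x, _ in train))
    a = my - b * mx
    return lambda x: a + b * x

def fit_1nn(train):
    # Nearest neighbour: reuse the effort of the most similar past case.
    return lambda x: min(train, key=lambda c: abs(c[0] - x))[1]

def mmre(model, cases):
    # Mean Magnitude of Relative Error over a set of validation cases.
    return sum(abs(y - model(x)) / y for x, y in cases) / len(cases)

population = simulate(2000)
validation = population[:1000]        # large (1,000 case) validation set
for seed in (1, 2):                   # two different training samples
    train = random.Random(seed).sample(population[1000:], 30)
    print(f"training sample {seed}: "
          f"OLS MMRE={mmre(fit_ols(train), validation):.2f}, "
          f"1-NN MMRE={mmre(fit_1nn(train), validation):.2f}")
```

Comparing the two printed lines shows the effect the abstract highlights: with noisy data, the ranking of techniques can shift from one training sample to the next, which is why resampling schemes such as bootstrapping are recommended before drawing conclusions.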