Background: The prediction performance of a case-based reasoning (CBR) model is influenced by the combination of the following parameters: (i) similarity function, (ii) number of nearest neighbor cases, (iii) weighting technique used for attributes, and (iv) solution algorithm. Each combination of these parameters is considered an instantiation of the general CBR-based prediction method. Selecting an instantiation for a new data set with specific characteristics (such as size, defect density, and language) is called customization of the general CBR method. Aims: For the purpose of defect prediction, we investigate which combinations of parameters work best in which situations. Three more specific questions were studied: (RQ1) Does one size fit all, that is, is one instantiation always the best? (RQ2) If not, which individual and combined parameter settings occur most frequently in generating the best prediction results? (RQ3) Are there context-specific rules to support the customization? Method: In total, 120 different CBR instantiations were created and applied to 11 data sets from the PROMISE repository. Predictions were evaluated in terms of their mean magnitude of relative error (MMRE) and the percentage Pred(α) of objects fulfilling a prediction quality level α. For the third research question, dependency network analysis was performed. Results: The parameter options that occurred most frequently in the best-performing CBR instantiations were neural-network-based sensitivity analysis (as the weighting technique), un-weighted average (as the solution algorithm), and the maximum number of nearest neighbors (as the number of nearest neighbors). Dependency network analysis yielded a set of recommendations for customization. Conclusion: An approach to support customization is provided. It was confirmed that applying context-specific rules across groups of similar data sets is risky and produces poor results.
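The four parameters and the two evaluation criteria named above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the weighted Euclidean distance used as the similarity function, and the reading of Pred(α) as the fraction of objects whose magnitude of relative error is at most α are all assumptions made for illustration. The sketch also assumes every actual value is strictly positive, since the relative error is undefined at zero.

```python
import math


def weighted_distance(a, b, weights):
    """Assumed similarity function: weighted Euclidean distance (smaller = more similar)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def cbr_predict(case_base, query, k, weights):
    """One hypothetical CBR instantiation.

    case_base: list of (feature_tuple, known_value) pairs.
    k:         number of nearest neighbor cases.
    weights:   one weight per attribute (the weighting technique's output).
    Solution algorithm: un-weighted average of the k nearest cases.
    """
    ranked = sorted(case_base, key=lambda case: weighted_distance(case[0], query, weights))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


def mmre(actual, predicted):
    """Mean magnitude of relative error; assumes every actual value is > 0."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def pred(actual, predicted, alpha):
    """Fraction of objects whose relative error is at most alpha (assumed reading of Pred(α))."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) / a <= alpha)
    return hits / len(actual)


if __name__ == "__main__":
    # Toy data invented for illustration only; not taken from the PROMISE repository.
    cases = [((1.0, 1.0), 2.0), ((2.0, 2.0), 4.0), ((10.0, 10.0), 20.0)]
    estimate = cbr_predict(cases, query=(1.5, 1.5), k=2, weights=(1.0, 1.0))
    print("estimate:", estimate)
    print("MMRE:", mmre([10.0, 20.0], [8.0, 25.0]))
    print("Pred(0.22):", pred([10.0, 20.0], [8.0, 25.0], 0.22))
```

Changing the distance function, `k`, `weights`, or the averaging step gives a different instantiation in the sense used above. Comparing instantiations then amounts to computing `mmre` and `pred` for each one on the same data set.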