Background: The main goal of the PROMISE repository is to enable reproducible, and thus verifiable or refutable, research. Over time, many data sets have become available, especially for defect prediction problems. Aims: In this study, we investigate problems and pitfalls that occur during replication. This information can inform future replication studies and serve as a guideline for researchers reporting novel results. Method: We replicate two recent defect prediction studies that compare different data sets and learning algorithms, and report missing information and the problems encountered. Results: Even with access to the original data sets, replicating previous studies may not yield exactly the same results. The choice of evaluation procedures, performance measures, and presentation has a large influence on reproducibility. Additionally, we show that trivial and random models can be used to identify overly optimistic evaluation measures. Conclusions: The best way to conduct easily reproducible studies is to share all associated artifacts, e.g., the scripts and programs used. When this is not an option, our results can be used to simplify the replication task for other researchers.
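
The point about trivial and random models can be illustrated with a minimal sketch. It is not the procedure used in the study; the data set (1,000 modules, 10% defective), the three baselines, and all function names are assumptions chosen for illustration. The idea is that any measure on which a model that ignores its inputs scores well is likely to be overly optimistic for that data set.

```python
import random


def confusion(actual, predicted):
    """Count true/false positives and negatives; True means 'defective'."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tn = sum(1 for a, p in pairs if not a and not p)
    return tp, fp, fn, tn


def scores(actual, predicted):
    """Compute common performance measures, returning 0.0 where undefined."""
    tp, fp, fn, tn = confusion(actual, predicted)
    accuracy = (tp + tn) / len(actual)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical, imbalanced data set: 100 defective and 900 clean modules.
actual = [True] * 100 + [False] * 900

# Trivial models ignore the module entirely.
all_clean = [False] * len(actual)
all_defective = [True] * len(actual)

# Random model: flags each module with probability 0.5 (seeded for repeatability).
rng = random.Random(42)
random_guess = [rng.random() < 0.5 for _ in actual]

baselines = {
    "all_clean": scores(actual, all_clean),
    "all_defective": scores(actual, all_defective),
    "random": scores(actual, random_guess),
}

for name, result in baselines.items():
    print(name, {k: round(v, 3) for k, v in result.items()})
```

Under these assumptions, the model that labels every module as clean reaches an accuracy of 0.9 while finding no defects at all, and the model that labels every module as defective reaches a recall of 1.0 at a precision of 0.1. A reported result is only informative on a given measure if it clearly exceeds what such baselines achieve on the same data.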