We propose a practical defect prediction approach for companies that do not track defect-related data. Specifically, we investigate the applicability of cross-company (CC) data for building localized defect predictors using static code features. First, we analyze the conditions under which CC data can be used as is. These conditions turn out to be quite few. We then apply principles of analogy-based learning (i.e., nearest neighbor (NN) filtering) to CC data in order to fine-tune these models for localization. We compare the performance of these models with that of defect predictors learned from within-company (WC) data. As expected, defect predictors learned from WC data outperform those learned from CC data. However, predictors learned from NN-filtered CC data perform close to, though still not better than, predictors learned from WC data. We therefore perform a final analysis to determine the minimum number of local defect reports needed to learn WC defect predictors. We demonstrate that the minimum number of data samples required to build effective defect predictors can be quite small and can be collected within a few months. Hence, for companies with no local defect data, we recommend a two-phase approach that lets them start defect prediction immediately. In phase one, companies should use NN-filtered CC data to initiate the defect prediction process and simultaneously start collecting WC (local) data. Once enough WC data has been collected (i.e., after a few months), organizations should switch to phase two and use predictors learned from WC data.
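The NN filtering and two-phase switch described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Euclidean distance, the neighbor count k, and the min_wc_rows threshold are all assumptions made for illustration, since the abstract gives no concrete values. The idea is to keep only those CC modules that lie among the k nearest neighbors of some local (unlabeled) module, and to fall back to WC data once enough of it exists.

```python
import math

def nn_filter(cc_rows, local_features, k=10):
    """Keep the CC rows that are among the k nearest neighbours of any local module.

    cc_rows: list of (feature_tuple, defect_label) pairs from other companies.
    local_features: list of feature tuples for local modules (no labels needed).
    k: neighbours kept per local module (assumed value, not from the abstract).
    """
    selected = set()
    for local in local_features:
        # Rank CC rows by Euclidean distance to this local module.
        ranked = sorted(range(len(cc_rows)),
                        key=lambda i: math.dist(cc_rows[i][0], local))
        selected.update(ranked[:k])
    # The union of all neighbours, in original order, is the filtered training set.
    return [cc_rows[i] for i in sorted(selected)]

def choose_training_data(cc_rows, wc_rows, local_features, min_wc_rows=100, k=10):
    """Two-phase choice: NN-filtered CC data first, WC data once enough is collected.

    min_wc_rows is a hypothetical placeholder threshold.
    """
    if len(wc_rows) >= min_wc_rows:
        return wc_rows                               # phase two
    return nn_filter(cc_rows, local_features, k)     # phase one

# Toy static code features: (lines of code, cyclomatic complexity).
cc_data = [((10.0, 2.0), 0), ((12.0, 3.0), 1), ((200.0, 40.0), 1),
           ((210.0, 45.0), 0), ((11.0, 2.5), 0)]
local_modules = [(11.0, 2.0), (13.0, 3.0)]

filtered = nn_filter(cc_data, local_modules, k=2)
phase_one = choose_training_data(cc_data, [], local_modules, k=2)
```

In this toy run the two large CC modules are dropped because they are far from every local module. In practice the features would typically be normalized before distances are computed, so that no single attribute dominates.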