Context: There are many methods that input static code features and output a predictor for faulty code modules. These data mining methods have hit a "performance ceiling"; i.e., some inherent upper bound on the amount of information offered by, say, static code features when identifying modules that contain faults. Objective: We seek an explanation for this ceiling effect. Perhaps static code features have "limited information content"; i.e., their information can be quickly and completely discovered by even simple learners. Method: An initial literature review documents the ceiling effect in other work. Next, using three sub-sampling techniques (under-, over-, and micro-sampling), we look for the lower useful bound on the number of training instances. Results: Using micro-sampling, we find that as few as 50 instances yield as much information as larger training sets. Conclusions: We have found much evidence for the limited information hypothesis. Further progress in learning defect predictors may not come from better algorithms. Rather, we need to improve the information content of the training data, perhaps with case-based reasoning methods.
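
The three sub-sampling techniques named in the Method can be illustrated with a minimal Python sketch. The abstract does not define them, so the definitions below are assumptions based on common usage: under-sampling trims the majority class to the size of the minority class, over-sampling repeats minority rows until the classes balance, and micro-sampling keeps only a small, equal number of rows per class. The names `Row`, `under_sample`, `over_sample`, `micro_sample`, and `per_class` are hypothetical, and reading "50 instances" as 25 rows per class is one possible interpretation, not something the abstract states.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical row layout: (static code features, is_defective).
Row = Tuple[Sequence[float], bool]


def split_by_class(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Separate defective rows from non-defective rows."""
    defective = [r for r in rows if r[1]]
    clean = [r for r in rows if not r[1]]
    return defective, clean


def under_sample(rows: List[Row], rng: random.Random) -> List[Row]:
    """Assumed definition: shrink the majority class to the minority size."""
    minority, majority = sorted(split_by_class(rows), key=len)
    return minority + rng.sample(majority, len(minority))


def over_sample(rows: List[Row], rng: random.Random) -> List[Row]:
    """Assumed definition: repeat minority rows until both classes match.

    Requires at least one row in the minority class.
    """
    minority, majority = sorted(split_by_class(rows), key=len)
    shortfall = len(majority) - len(minority)
    extra = [rng.choice(minority) for _ in range(shortfall)]
    return majority + minority + extra


def micro_sample(rows: List[Row], per_class: int, rng: random.Random) -> List[Row]:
    """Assumed definition: keep at most `per_class` rows from each class."""
    defective, clean = split_by_class(rows)
    m = min(per_class, len(defective), len(clean))
    return rng.sample(defective, m) + rng.sample(clean, m)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data for illustration only: 40 defective, 360 clean modules.
    rows = [((float(i),), i < 40) for i in range(400)]
    print(len(under_sample(rows, rng)))      # 40 + 40 rows
    print(len(over_sample(rows, rng)))       # 360 + 360 rows
    print(len(micro_sample(rows, 25, rng)))  # 25 + 25 rows
```

Under these assumptions, a learner trained on the output of `micro_sample(rows, 25, rng)` sees only 50 rows, which is the scale at which the abstract reports no loss of information relative to larger training sets.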