This paper presents the position that software-quality modeling of open-source software for high-performance computing can identify modules that have a high risk of bugs.

Given the source code for a recent release, a model can predict which modules are likely to have bugs, based on data from past releases. If a user knows which software modules correspond to functionality of interest, then risks to operations become apparent. If the risks are too great, the user may prefer not to upgrade to the most recent release.

Of course, such predictions are never perfect. After release, bugs are discovered. Some bugs are missed by the model, and some predicted errors do not occur. A successful model will be accurate enough for informed management action at the time of the predictions.

As evidence for this position, this paper summarizes a case study of the Portable, Extensible Toolkit for Scientific Computation (PETSc), which is a mathematical library for high-performance computing. Data was drawn from source code and configuration-management logs. The accuracy of logistic-regression and decision-tree models indicated that the methodology is promising. The case study also illustrated several modeling issues.