Software metrics-based quality classification models predict a software module as either fault-prone (fp) or not fault-prone (nfp). Timely application of such models can help direct quality improvement efforts to modules that are likely to be fp during operations, making cost-effective use of software quality testing and enhancement resources. Because several classification techniques are available, a comparative study of commonly used ones can be useful to practitioners. We present a comprehensive evaluation of the relative performance of seven classification techniques and/or tools: logistic regression, case-based reasoning, classification and regression trees (CART), tree-based classification with S-PLUS, and the Sprint-Sliq, C4.5, and Treedisc algorithms. The expected cost of misclassification (ECM) is introduced as a single unified measure for comparing the performance of different software quality classification models. ECM is a function of the costs of Type I misclassifications (an nfp module misclassified as fp) and Type II misclassifications (an fp module misclassified as nfp), and it is computed for different cost ratios. Evaluating software quality classification models under varying cost ratios is important because the usefulness of a model depends on the system-specific costs of misclassification. Moreover, models should be compared, and preferred, for cost ratios that fall within the range of interest for the given system and project domain. Software metrics were collected from four successive releases of a large legacy telecommunications system. A two-way ANOVA with a randomized complete block design is used, in which the system release is treated as a block and the modeling method is treated as a factor.
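As an illustration of why the cost ratio matters, the following minimal sketch computes a normalized ECM for two hypothetical models. The abstract does not state the ECM formula; the form used here, ECM / C_I = Pr(fp|nfp) * pi_nfp + (C_II / C_I) * Pr(nfp|fp) * pi_fp, is a common formulation and is an assumption. The error rates, class priors, and cost ratios are invented for illustration and are not results from the study.

```python
def normalized_ecm(type1_rate, type2_rate, prior_nfp, prior_fp, cost_ratio):
    """Normalized expected cost of misclassification (assumed form).

    type1_rate: fraction of nfp modules misclassified as fp
    type2_rate: fraction of fp modules misclassified as nfp
    prior_nfp, prior_fp: class proportions (should sum to 1)
    cost_ratio: C_II / C_I, cost of a Type II error relative to a Type I error
    """
    return type1_rate * prior_nfp + cost_ratio * type2_rate * prior_fp


# Hypothetical models: A flags more modules (fewer Type II errors),
# B flags fewer modules (fewer Type I errors).
MODELS = {
    "A": {"type1_rate": 0.30, "type2_rate": 0.20},
    "B": {"type1_rate": 0.10, "type2_rate": 0.40},
}
PRIOR_NFP, PRIOR_FP = 0.9, 0.1  # hypothetical class proportions


def best_model(cost_ratio):
    """Return the name of the model with the lowest normalized ECM."""
    scores = {
        name: normalized_ecm(m["type1_rate"], m["type2_rate"],
                             PRIOR_NFP, PRIOR_FP, cost_ratio)
        for name, m in MODELS.items()
    }
    return min(scores, key=scores.get)


if __name__ == "__main__":
    for ratio in (1, 5, 20, 50):
        print(f"C_II/C_I = {ratio:>2}: preferred model = {best_model(ratio)}")
```

With these invented numbers the preferred model changes as the cost ratio grows, which is the reason for comparing models only over the cost ratios relevant to a given project.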
The predictive performance of the models is observed to differ significantly across the system releases, implying that in the software engineering domain prediction models are influenced by the characteristics of the data and the system being modeled. Multiple pairwise comparisons are performed to evaluate the relative performance of the seven models for the cost ratios of interest to the case study. In addition, the performance of the seven classification techniques is compared with that of a classification based on lines of code. The comparative approach presented in this paper can also be applied to other software systems.
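The randomized complete block design mentioned above can be sketched as a two-way ANOVA without replication, with one observation per method and release. The sketch below is a minimal illustration under stated assumptions: the response values are invented placeholders (for example, a normalized ECM per method per release), only three hypothetical methods are shown, and it returns F statistics without p-values. It is not the authors' analysis.

```python
def rcbd_anova(table):
    """Two-way ANOVA for a randomized complete block design.

    table[i][j] is the response for method i (factor) in release j (block),
    with exactly one observation per cell.
    """
    a = len(table)      # number of methods (factor levels)
    b = len(table[0])   # number of releases (blocks)
    grand = sum(sum(row) for row in table) / (a * b)

    method_means = [sum(row) / b for row in table]
    block_means = [sum(table[i][j] for i in range(a)) / a for j in range(b)]

    # Sums of squares for the factor, the block, and the residual.
    ss_method = b * sum((m - grand) ** 2 for m in method_means)
    ss_block = a * sum((m - grand) ** 2 for m in block_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_method - ss_block

    df_method, df_block = a - 1, b - 1
    df_error = df_method * df_block
    ms_error = ss_error / df_error

    return {
        "F_method": (ss_method / df_method) / ms_error,
        "F_block": (ss_block / df_block) / ms_error,
        "df": (df_method, df_block, df_error),
    }


# Hypothetical responses: rows are methods, columns are releases 1 to 4.
EXAMPLE = [
    [0.30, 0.35, 0.42, 0.50],
    [0.28, 0.36, 0.40, 0.47],
    [0.33, 0.39, 0.45, 0.52],
]

if __name__ == "__main__":
    result = rcbd_anova(EXAMPLE)
    print("degrees of freedom (method, block, error):", result["df"])
    print("F for method:", round(result["F_method"], 2))
    print("F for release (block):", round(result["F_block"], 2))
```

Each F statistic would then be compared against the F distribution with the corresponding degrees of freedom. A large block F corresponds to the finding that performance differs across releases, and a significant method F motivates the multiple pairwise comparisons.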