The Mahalanobis-Taguchi (MT) strategy combines mathematical and statistical concepts such as Mahalanobis distance, Gram-Schmidt orthogonalization, and experimental designs to support diagnosis and decision-making based on multivariate data. Its primary purpose is to develop a scale that measures the degree of abnormality of cases relative to "normal" or "healthy" cases, i.e., to derive a continuous scale from a set of binary-classified cases. An optimal subset of variables for measuring abnormality is then selected, and rules for future diagnosis are defined based on these variables and the measurement scale. This maps well to software defect prediction based on a multivariate set of software metrics and attributes. In this paper, the MT strategy, combined with a cluster analysis technique for determining the most appropriate training set, is described and applied to well-known datasets in order to evaluate the fault-proneness of software modules. The measurement scale resulting from the MT strategy is evaluated using ROC curves, and the results indicate that it is a promising technique for software defect diagnosis. It compares favorably to previously evaluated methods on a number of publicly available datasets. Because the MT strategy quantifies the level of abnormality rather than only assigning a class, it can also stimulate and inform discussions with engineers and managers in different defect prediction situations.
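As a rough illustration of the measurement scale described above, the following Python sketch builds a scaled Mahalanobis distance from a group of "normal" cases and summarizes how well it separates abnormal cases using the area under the ROC curve. It is not the paper's implementation. The function names, the synthetic data, and the choice of the scaled form (distance divided by the number of variables, using the correlation matrix of the normal group) are assumptions made for illustration. Gram-Schmidt orthogonalization, variable selection through experimental designs, and the cluster-based choice of the training set are omitted.

```python
import numpy as np


def fit_normal_space(normal):
    """Estimate mean, std, and inverse correlation matrix from the normal group."""
    normal = np.asarray(normal, dtype=float)
    mean = normal.mean(axis=0)
    std = normal.std(axis=0, ddof=1)
    corr = np.corrcoef(normal, rowvar=False)
    # The pseudo-inverse tolerates a near-singular correlation matrix.
    return mean, std, np.linalg.pinv(corr)


def scaled_md(x, mean, std, corr_inv):
    """Scaled Mahalanobis distance: z^T C^{-1} z / k for each row of x."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    z = (x - mean) / std          # standardize with normal-group statistics
    k = z.shape[1]                # number of variables
    return np.einsum("ij,jk,ik->i", z, corr_inv, z) / k


def auc(scores_abnormal, scores_normal):
    """Area under the ROC curve as the probability that an abnormal case
    scores higher than a normal one (ties count as one half)."""
    pos = np.asarray(scores_abnormal, dtype=float)[:, None]
    neg = np.asarray(scores_normal, dtype=float)[None, :]
    return float(((pos > neg) + 0.5 * (pos == neg)).mean())


# Synthetic stand-in for software metrics (hypothetical data, 4 variables).
rng = np.random.default_rng(0)
normal_modules = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
faulty_modules = rng.normal(loc=3.0, scale=1.0, size=(50, 4))

mean, std, corr_inv = fit_normal_space(normal_modules)
md_normal = scaled_md(normal_modules, mean, std, corr_inv)
md_faulty = scaled_md(faulty_modules, mean, std, corr_inv)

print("mean scaled MD, normal group:", md_normal.mean())
print("mean scaled MD, faulty group:", md_faulty.mean())
print("AUC:", auc(md_faulty, md_normal))
```

With this scaling the normal group averages close to 1 on the scale, so larger values can be read directly as degrees of abnormality. In practice the normal group would be non-faulty modules from a real dataset rather than simulated values.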