Analyzing Software Measurement Data with Clustering Techniques

Authors:
Shi Zhong;Taghi M. Khoshgoftaar;Naeem Seliya
Venue:
IEEE Intelligent Systems
Year:
2004

Abstract

Software engineers often build quality-estimation models that predict the fault-proneness of software modules by training a classifier on labeled software metrics data. In real-world projects they face two challenges: noisy data and missing fault-proneness labels, without which a classifier cannot be trained. The clustering-based exploratory analysis method addresses both challenges by combining clustering algorithms with the judgment of a software engineering expert. The method is unsupervised because it requires no labeled training data to predict modules' fault-proneness. Two real-world case studies verify the effectiveness of this clustering- and expert-based approach in predicting the fault-proneness of software modules and in identifying potentially noisy modules.
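The general idea in the abstract (cluster unlabeled modules, then let an expert label each cluster) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the clustering algorithms, so plain k-means is swapped in; the two metrics (lines of code, cyclomatic complexity) and their values are invented toy data; and `expert_label` is a hypothetical stand-in for a human expert who would inspect each cluster's statistics.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two metric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain k-means; returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct modules
    assign = None
    for _ in range(iters):
        # Assignment step: each module goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assign == assign:  # no change, so the partition is stable
            break
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, assign


def expert_label(centroid, complexity_threshold=10.0):
    """Hypothetical stand-in for the expert's judgment of one cluster.

    Assumption: a cluster whose mean complexity is high is called
    fault-prone ("fp"); otherwise not fault-prone ("nfp").
    """
    _, mean_complexity = centroid
    return "fp" if mean_complexity >= complexity_threshold else "nfp"


# Invented toy data: (lines of code, cyclomatic complexity) per module.
modules = [
    (20, 2), (25, 3), (30, 2), (22, 1),   # small, simple modules
    (400, 30), (450, 35), (500, 28),      # large, complex modules
]

centroids, assign = kmeans(modules, k=2)
cluster_labels = [expert_label(c) for c in centroids]  # one label per cluster
labels = [cluster_labels[a] for a in assign]           # propagate to modules

for module, label in zip(modules, labels):
    print(module, label)
```

No fault-proneness labels are used during clustering; labels enter only when the expert judges each cluster as a whole, which is what makes the approach usable on unlabeled projects. On real data the metrics would normally be standardized first so that one large-valued metric does not dominate the distance.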