Extracting classification rule of software diagnosis using modified MEPA

Authors:
Jr-Shian Chen;Ching-Hsue Cheng
Affiliations:
Department of Information Management, National Yunlin University of Science and Technology, 123, Section 3, University Road, Touliu, Yunlin 640, Taiwan and Department of Computer Science and Infor ...;Department of Information Management, National Yunlin University of Science and Technology, 123, Section 3, University Road, Touliu, Yunlin 640, Taiwan
Venue:
Expert Systems with Applications: An International Journal
Year:
2008

Citing 11

The KDD process for extracting useful knowledge from volumes of data. Communications of the ACM
Software Metrics: A Rigorous Approach
Data Mining: Introductory and Advanced Topics
Discretization: An Enabling Technique. Data Mining and Knowledge Discovery
Induction of Decision Trees. Machine Learning
Potter's Wheel: An Interactive Data Cleaning System. Proceedings of the 27th International Conference on Very Large Data Bases
Knowledge Discovery in Databases: An Attribute-Oriented Approach. VLDB '92 Proceedings of the 18th International Conference on Very Large Data Bases
Genetic Programming-Based Decision Trees for Software Quality Classification. ICTAI '03 Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence
Data Mining: Concepts and Techniques
Detecting noisy instances with the rule-based classification model. Intelligent Data Analysis
Data mining in soft computing framework: a survey. IEEE Transactions on Neural Networks

Cited 5

Accuracy and efficiency comparisons of single- and multi-cycled software classification models. Information and Software Technology
Diagnosing Cardiovascular Disease Using an Enhanced Rough Sets Approach. Applied Artificial Intelligence
A hybrid model based on rough sets theory and genetic algorithms for stock price forecasting. Information Sciences: an International Journal
Application of decision tree based on C4.5 in analysis of coal logistics customer. IITA'09 Proceedings of the 3rd international conference on Intelligent information technology application
A soft-computing based rough sets classifier for classifying IPO returns in the financial markets. Applied Soft Computing

Quantified Score

Hi-index	12.05

Abstract

Defective software modules cause software failures, increase development and maintenance costs, and reduce customer satisfaction. Effective defect prediction models can help developers focus quality assurance activities on defect-prone modules and thus improve software quality by using resources more efficiently. Real-world databases are highly susceptible to noisy, missing, and inconsistent data. Noise is a random error or variance in a measured variable [Han, J., & Kamber, M. (2001). Data Mining: Concepts and Techniques, San Francisco: Morgan Kaufmann Publishers]. When decision trees are built, many of the branches may reflect noisy or outlier data, so data preprocessing steps are very important. There are many methods for data preprocessing. Concept hierarchies are a form of data discretization that can be used for data preprocessing. Data discretization has several advantages; for example, it reduces and simplifies the data. Representations that use discrete features are usually more compact, shorter, and more accurate than those that use continuous ones [Liu, H., Hussain, F., Tan, C.L., & Dash, M. (2002). Discretization: An enabling technique. Data Mining and Knowledge Discovery, 6(4), 393-423]. In this paper, we propose a modified minimize entropy principle approach (MEPA) and develop a modified MEPA system to partition the data, and then build the classification tree model. For verification, we apply the proposed method to two NASA software projects, KC2 and JM1, and establish a prototype system to discretize the data from these projects. The error rate and the number of rules both show that the proposed approach outperforms the other methods.
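
To make the idea of entropy-based partitioning concrete, the following minimal Python sketch picks a single cut point for one continuous attribute by minimizing the weighted class entropy of the two resulting intervals. It illustrates generic entropy-minimizing discretization only; it is not the modified MEPA described in the paper, whose details are not given in the abstract. The function names `class_entropy` and `best_cut_point`, the lines-of-code values, and the defect labels are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def class_entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # sum of p * log2(1/p) over the observed classes
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def best_cut_point(values, labels):
    """Return (threshold, weighted_entropy) for the binary split of a
    continuous attribute that minimizes the weighted class entropy.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, entropy_of_all_labels) when no split reduces the entropy.
    """
    pairs = sorted(zip(values, labels), key=lambda pair: pair[0])
    sorted_labels = [label for _, label in pairs]
    best_threshold = None
    best_entropy = class_entropy(sorted_labels)
    for i in range(1, len(pairs)):
        left_value, right_value = pairs[i - 1][0], pairs[i][0]
        if left_value == right_value:
            continue  # no threshold can separate equal values
        left, right = sorted_labels[:i], sorted_labels[i:]
        weighted = (len(left) * class_entropy(left)
                    + len(right) * class_entropy(right)) / len(pairs)
        if weighted < best_entropy:
            best_threshold = (left_value + right_value) / 2
            best_entropy = weighted
    return best_threshold, best_entropy


# Hypothetical module sizes (lines of code) and defect labels, illustration only.
loc = [10, 12, 15, 40, 55, 60]
defective = ["no", "no", "no", "yes", "yes", "yes"]
threshold, entropy = best_cut_point(loc, defective)  # threshold is 27.5 here
```

Applying such a split recursively to each resulting interval would yield a multi-interval discretization, after which the discretized attributes could be passed to a decision tree learner. How the paper's modified approach chooses and stops its partitions is not specified in the abstract, so that part is left out of the sketch.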