Handling incomplete categorical data for supervised learning

Authors:
Been-Chian Chien;Cheng-Feng Lu;Steen J. Hsu
Affiliations:
Department of Computer Science and Information Engineering, National University of Tainan, Tainan, Taiwan, R.O.C.;Department of Information Engineering, I-Shou University, Kaohsiung, Taiwan, R.O.C.;Department of Information Management, Ming Hsin University of Science and Technology, Hsin-Chu, Taiwan, R.O.C.
Venue:
IEA/AIE'06 Proceedings of the 19th International Conference on Advances in Applied Artificial Intelligence: Industrial, Engineering and Other Applications of Applied Intelligent Systems
Year:
2006

Abstract

Classification is an important research topic in knowledge discovery. Most research on classification assumes that a complete dataset is given for training and that the test data contain values for all attributes, with none missing. Unfortunately, incomplete data are common in real-world applications. In this paper, we propose new schemes for learning classification models from incomplete categorical data. Three methods based on rough set theory are developed and discussed for handling incomplete training data. In experiments, the proposed handling schemes are evaluated with several well-known classification models and compared with previous methods.
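
The abstract does not describe the three proposed methods, so the following is only a generic illustration of how rough set theory can treat missing categorical values, not the authors' schemes. It sketches the tolerance relation commonly used for incomplete information systems: two objects are considered indiscernible if they agree on every attribute where both values are known. The function names, the use of `None` for a missing value, and the toy weather-style data are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Set, Tuple

# A row maps attribute names to categorical values; None marks a missing value
# (an assumption of this sketch, not notation from the paper).
Row = Dict[str, Optional[str]]


def tolerant(x: Row, y: Row, attrs: List[str]) -> bool:
    """Return True if x and y agree on every attribute where both are known."""
    return all(x[a] is None or y[a] is None or x[a] == y[a] for a in attrs)


def tolerance_classes(rows: List[Row], attrs: List[str]) -> List[Set[int]]:
    """For each row, collect the indices of all rows tolerant with it."""
    return [
        {j for j, y in enumerate(rows) if tolerant(x, y, attrs)}
        for x in rows
    ]


def approximations(
    rows: List[Row], attrs: List[str], labels: List[str], target: str
) -> Tuple[Set[int], Set[int]]:
    """Lower and upper approximations of the set of rows labelled `target`."""
    concept = {i for i, lab in enumerate(labels) if lab == target}
    classes = tolerance_classes(rows, attrs)
    # Lower: rows whose whole tolerance class lies inside the concept.
    lower = {i for i, c in enumerate(classes) if c <= concept}
    # Upper: rows whose tolerance class overlaps the concept.
    upper = {i for i, c in enumerate(classes) if c & concept}
    return lower, upper


# Hypothetical toy data with two missing values.
rows: List[Row] = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": None},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": None, "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

lower, upper = approximations(rows, ["outlook", "windy"], labels, "play")
```

In this toy example only row 0 is certainly in the "play" class, while rows 0, 1 and 3 are possibly in it: the missing values make rows 1 and 3 indistinguishable from each other. How such certain and possible regions are turned into a training scheme is exactly what a handling method has to decide, and the abstract leaves that unspecified.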