Dealing with missing data: algorithms based on fuzzy set and rough set theories

Authors:
Dan Li; Jitender Deogun; William Spaulding; Bill Shuart
Affiliations:
Department of Computer Science & Engineering, University of Nebraska-Lincoln, Lincoln, NE (Li, Deogun); Department of Psychology, University of Nebraska-Lincoln, Lincoln, NE (Spaulding, Shuart)
Venue:
Transactions on Rough Sets IV
Year:
2005

Abstract

Missing data, commonly encountered in many fields of study, introduce inaccuracy into analysis and evaluation. Previous methods for handling missing data (e.g., deleting cases with incomplete information, or substituting the missing values with estimated mean scores), though simple to implement, are problematic because they may result in biased data models. Fortunately, recent advances in theoretical and computational statistics have led to more flexible techniques for dealing with the missing data problem. In this paper, we present missing data imputation methods based on clustering, one of the most popular techniques in Knowledge Discovery in Databases (KDD). We combine clustering with soft computing, which tends to be more tolerant of imprecision and uncertainty, and apply fuzzy and rough clustering algorithms to deal with incomplete data. The experiments show that a hybridization of fuzzy set and rough set theories in missing data imputation algorithms leads to the best performance among the four algorithms we compare: crisp K-means, fuzzy K-means, rough K-means, and rough-fuzzy K-means imputation.
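
The abstract does not spell out the imputation procedures, so the following is only a minimal sketch of the general idea behind clustering-based imputation, not the authors' algorithms. It assumes that clusters are built from the complete cases only, that a missing attribute is filled from the cluster centroids, and that the fuzzy variant weights centroids by fuzzy c-means style memberships with fuzzifier `m`. The names `kmeans`, `observed_distance`, and `impute` are hypothetical, and the rough and rough-fuzzy variants are not covered.

```python
import math

def kmeans(rows, k, iters=20):
    """Plain K-means on complete rows; centroids start at the first k rows."""
    centroids = [list(r) for r in rows[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            nearest = min(range(k), key=lambda j: math.dist(r, centroids[j]))
            groups[nearest].append(r)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def observed_distance(row, centroid):
    """Euclidean distance over the attributes that are present (not None)."""
    return math.sqrt(sum((v - c) ** 2
                         for v, c in zip(row, centroid) if v is not None))

def impute(data, k=2, fuzzy=False, m=2.0):
    """Fill None entries from centroids learned on the complete rows.

    Assumption: this is an illustrative scheme, not the paper's method.
    """
    complete = [r for r in data if None not in r]
    centroids = kmeans(complete, k)
    result = []
    for row in data:
        if None not in row:
            result.append(list(row))
            continue
        dists = [observed_distance(row, c) for c in centroids]
        if fuzzy and min(dists) > 0:
            # Fuzzy memberships: weight_j proportional to d_j^(-2/(m-1)).
            weights = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(weights)
            weights = [w / total for w in weights]
        else:
            # Crisp assignment: all weight on the nearest centroid.
            j = dists.index(min(dists))
            weights = [1.0 if i == j else 0.0 for i in range(k)]
        filled = [v if v is not None
                  else sum(w * c[i] for w, c in zip(weights, centroids))
                  for i, v in enumerate(row)]
        result.append(filled)
    return result

# Toy data (made up for illustration): two well-separated groups,
# plus one object with a missing second attribute.
data = [[1.0, 1.0], [9.0, 9.0], [1.2, 0.8], [8.8, 9.2], [2.0, None]]
crisp = impute(data, k=2)
fuzzy = impute(data, k=2, fuzzy=True)
```

In this sketch the crisp variant copies the missing attribute from the single nearest centroid, while the fuzzy variant blends all centroids according to membership, so an object lying between clusters receives an intermediate value.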