Mining incomplete survey data through classification

Authors:
Hai Wang;Shouhong Wang
Affiliations:
Saint Mary’s University, Sobey School of Business, B3H 2W3, Halifax, NS, Canada;University of Massachusetts Dartmouth, Charlton College of Business, 02747-2300, Dartmouth, MA, USA
Venue:
Knowledge and Information Systems
Year:
2010

Abstract

Data mining with incomplete survey data is an immature subject area. When mining a database with incomplete data, the patterns of missing data and the potential implications of those missing data constitute valuable knowledge. This paper presents the conceptual foundations of data mining with incomplete data through classification that is relevant to a specific decision-making problem. The proposed technique assumes that incomplete data and complete data may come from different sub-populations. Its major objective is to detect interesting patterns of data-missing behavior that are relevant to a specific decision-making problem, rather than to estimate individual missing values. With this technique, a set of complete data is used to acquire a near-optimal classifier, which provides the prediction reference information for analyzing the incomplete data. The data-missing behavior concealed in the missing data is then revealed. Using a real-world survey data set, the paper demonstrates the usefulness of this technique.
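
The workflow outlined in the abstract (fit a classifier on complete records, then use it as a reference when examining incomplete records) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name a classifier, so a simple nearest-centroid classifier stands in for the "near-optimal classifier". The function names, the feature names, and the toy records are all hypothetical and are not drawn from the paper's survey data. The sketch groups incomplete records by which fields are missing and reports how often the reference classifier agrees with the recorded label for each group.

```python
from collections import defaultdict

def split_by_completeness(records, features):
    """Separate records with all features present from those with gaps."""
    complete, incomplete = [], []
    for r in records:
        has_gap = any(r[f] is None for f in features)
        (incomplete if has_gap else complete).append(r)
    return complete, incomplete

def fit_centroids(complete, features, label="label"):
    """Stand-in classifier (an assumption): per-class mean of each feature."""
    sums = defaultdict(lambda: [0.0] * len(features))
    counts = defaultdict(int)
    for r in complete:
        counts[r[label]] += 1
        for i, f in enumerate(features):
            sums[r[label]][i] += r[f]
    return {c: [s / counts[c] for s in sums[c]] for c in counts}

def predict(centroids, record, features):
    """Nearest centroid, measured only over the features that are observed."""
    observed = [i for i, f in enumerate(features) if record[f] is not None]
    def dist(c):
        return sum((record[features[i]] - centroids[c][i]) ** 2 for i in observed)
    return min(sorted(centroids), key=dist)

def agreement_by_missing_pattern(centroids, incomplete, features, label="label"):
    """For each pattern of missing fields, the share of records whose
    recorded label matches the reference classifier's prediction."""
    stats = defaultdict(lambda: [0, 0])  # pattern -> [matches, total]
    for r in incomplete:
        pattern = tuple(f for f in features if r[f] is None)
        stats[pattern][0] += predict(centroids, r, features) == r[label]
        stats[pattern][1] += 1
    return {p: hits / total for p, (hits, total) in stats.items()}

# Hypothetical toy survey: two numeric answers and a decision label.
features = ["age", "income"]
records = [
    {"age": 25, "income": 30, "label": "no"},
    {"age": 30, "income": 35, "label": "no"},
    {"age": 50, "income": 80, "label": "yes"},
    {"age": 55, "income": 90, "label": "yes"},
    {"age": 28, "income": None, "label": "yes"},
    {"age": 26, "income": None, "label": "yes"},
    {"age": None, "income": 88, "label": "yes"},
]

complete, incomplete = split_by_completeness(records, features)
centroids = fit_centroids(complete, features)
rates = agreement_by_missing_pattern(centroids, incomplete, features)
print(rates)
```

In this toy data, respondents who omitted income disagree with the reference classifier while the respondent who omitted age agrees with it. A low agreement rate for a particular missing pattern is the kind of signal that might suggest those respondents come from a different sub-population, although deciding what counts as an interesting pattern is left to the analyst.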