Information enhancement for data mining

Authors:
Shichao Zhang
Affiliations:
Department of Computer Science, Zhejiang Normal University, PR China and State Key Laboratory for Novel Software Technology, Nanjing University, PR China
Venue:
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Year:
2011

Abstract

Information enhancement techniques are needed in many areas, such as data mining, machine learning, business intelligence, and web data analysis. Information enhancement mainly covers the following topics: data cleaning, data preparation and transformation, missing value imputation, feature and instance selection, feature construction, treatment of noisy and inconsistent data, data integration, data collection and housing, information enhancement, web data availability, and web data capture and representation, among others. It is impossible to outline all of these research topics in a single paper. In this study, we discuss information enhancement for data mining using existing missing data imputation techniques. We first review current research on imputing missing values, then experimentally evaluate the techniques and demonstrate the efficiency of missing data imputation techniques in enhancing information during pattern discovery from datasets with missing values.

© 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1:284–295. DOI: 10.1002/widm.21