Enhancing Density-Based Data Reduction Using Entropy

Authors:
D. Huang; Tommy W. S. Chow
Affiliations:
Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong;Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong
Venue:
Neural Computation
Year:
2006

Abstract

Data reduction algorithms select a small subset from a given large data set. This article first presents new data reduction criteria based on the concept of entropy. These criteria evaluate data reduction performance in a comprehensive way. Building on them, new data reduction procedures are developed, and the proposed scheme is shown to be efficient and effective under the newly introduced criteria. In addition, an outlier-filtering strategy with negligible computational cost is developed; in some instances, this strategy substantially improves the performance of supervised data analysis. The proposed procedures are compared with related techniques in two types of application, density estimation and classification, and extensive comparative results are included to support the contributions of the proposed algorithms.
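The abstract does not define the entropy-based criteria themselves. As a rough illustration of how an entropy-style measure can score a reduced set for density estimation, the sketch below computes the cross-entropy (average negative log-likelihood) of the full data under a Gaussian kernel density estimate built only from the subset. The criterion, the function names, the bandwidth, and the toy data are all assumptions chosen for illustration and are not taken from the article.

```python
import math
import random


def gaussian_kde_logpdf(x, centers, bandwidth):
    """Log of a 1-D Gaussian kernel density estimate built on `centers`, evaluated at x."""
    norm = math.log(len(centers) * bandwidth * math.sqrt(2.0 * math.pi))
    exps = [-0.5 * ((x - c) / bandwidth) ** 2 for c in centers]
    m = max(exps)  # log-sum-exp trick for numerical stability
    return m + math.log(sum(math.exp(e - m) for e in exps)) - norm


def cross_entropy(full, subset, bandwidth=0.5):
    """Average negative log-likelihood (nats) of the full data under the subset's KDE.

    Hypothetical stand-in criterion, not the article's: lower values mean the
    subset's density estimate explains the full data set better.
    """
    return -sum(gaussian_kde_logpdf(x, subset, bandwidth) for x in full) / len(full)


rng = random.Random(0)
# Toy data (assumed): two well-separated 1-D clusters.
full = [rng.gauss(-3.0, 1.0) for _ in range(500)] + [rng.gauss(3.0, 1.0) for _ in range(500)]

spread_subset = rng.sample(full, 50)  # random sample that covers both clusters
lopsided_subset = sorted(full)[:50]   # the 50 smallest points: left tail only

ce_spread = cross_entropy(full, spread_subset)
ce_lopsided = cross_entropy(full, lopsided_subset)
print(f"spread: {ce_spread:.3f}  lopsided: {ce_lopsided:.3f}")
```

Under this stand-in measure, a subset that follows the density of the full data scores lower than one drawn from a single tail, which is the kind of behaviour a density-oriented reduction criterion is meant to capture.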