A frequent practice in feature selection is to maximize the Kullback-Leibler (K-L) distance between target classes. In this note we show that this common custom is often suboptimal, because it ignores the fact that classification is carried out with a finite number of samples. The K-L distance relates only to the mean separation of the classes; in classification, the variance and higher-order moments of the likelihood function should also be taken into account when selecting feature subsets. We derive appropriate expressions and show that they can lead to major increases in performance.
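To make the effect concrete, here is a minimal numerical sketch; it is not taken from the paper, and the two "features" and their class-conditional distributions are made up for illustration. Feature A has the larger K-L distance between the classes, but the variance of its log-likelihood ratio is also much larger, and a likelihood-ratio classifier deciding from only five samples makes more errors with it than with the lower-divergence feature B.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical class-conditional distributions over a few discrete symbols.
# These numbers are illustrative only; they are not from the paper.
features = {
    "A": (np.array([0.800, 0.199, 0.001]),   # p(x | class 0)
          np.array([0.800, 0.001, 0.199])),  # p(x | class 1)
    "B": (np.array([0.75, 0.25]),
          np.array([0.25, 0.75])),
}

def lr_error(p0, p1, n, trials=200_000):
    """Monte Carlo error rate of the likelihood-ratio test when each
    decision is based on n i.i.d. samples, with equal class priors.
    Ties (zero total log-ratio) are split evenly between the classes."""
    logr = np.log(p0 / p1)                   # per-symbol log-likelihood ratio
    x0 = rng.choice(len(p0), size=(trials, n), p=p0)
    x1 = rng.choice(len(p1), size=(trials, n), p=p1)
    s0, s1 = logr[x0].sum(axis=1), logr[x1].sum(axis=1)
    err0 = (s0 < 0).mean() + 0.5 * (s0 == 0).mean()   # class 0 misclassified
    err1 = (s1 > 0).mean() + 0.5 * (s1 == 0).mean()   # class 1 misclassified
    return 0.5 * (err0 + err1)

for name, (p0, p1) in features.items():
    logr = np.log(p0 / p1)
    kl = np.sum(p0 * logr)               # K-L distance = mean log-ratio under class 0
    var = np.sum(p0 * logr**2) - kl**2   # variance of the log-ratio under class 0
    err = lr_error(p0, p1, n=5)
    print(f"feature {name}: KL = {kl:.2f}, Var[log-ratio] = {var:.2f}, "
          f"5-sample error = {err:.3f}")
```

On these made-up numbers, feature A roughly doubles the K-L distance (about 1.05 vs. 0.55 nats) yet misclassifies noticeably more often at n = 5, because its evidence arrives in rare, high-variance bursts of the log-likelihood ratio; this is exactly the finite-sample effect the note describes.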