In this paper, a supervised feature selection approach is presented, based on a metric applied to both continuous and discrete data representations. The method builds a dissimilarity space over the features using information-theoretic measures, in particular the conditional mutual information between features given a relevant variable that represents the class labels. By applying hierarchical clustering, the algorithm searches for a compressed representation of the information contained in the original feature set. The proposed technique is compared with other state-of-the-art methods that are also based on information measures. Finally, several experiments are presented to show the effectiveness of the selected features in terms of classification accuracy.
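The pipeline described above can be sketched in a few steps: estimate a conditional-mutual-information-based dissimilarity between every pair of features, cluster the features hierarchically, and keep one representative per cluster. The sketch below is a minimal illustration for discrete features; the particular dissimilarity d(i,j) = I(Xi;Y|Xj) + I(Xj;Y|Xi), the single-linkage merging rule, and the per-cluster relevance criterion are assumptions for illustration, not the paper's exact definitions.

```python
import math
from collections import Counter
from itertools import combinations

def entropy(*cols):
    """Joint Shannon entropy (bits) of one or more discrete sequences."""
    joint = list(zip(*cols))
    n = len(joint)
    return -sum((c / n) * math.log2(c / n) for c in Counter(joint).values())

def mi(x, y):
    """Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def cmi(x, y, z):
    """Conditional mutual information I(X;Y|Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)

def select_features(X, y, k):
    """Cluster features by a CMI-derived dissimilarity; keep one per cluster.

    Dissimilarity d(i,j) = I(Xi;Y|Xj) + I(Xj;Y|Xi): large when each feature
    carries class information the other lacks (an illustrative choice, not
    necessarily the paper's exact metric).
    """
    m = len(X[0])
    cols = [[row[j] for row in X] for j in range(m)]
    d = {(i, j): cmi(cols[i], y, cols[j]) + cmi(cols[j], y, cols[i])
         for i, j in combinations(range(m), 2)}
    # Single-linkage agglomerative clustering until k clusters remain.
    clusters = [{j} for j in range(m)]
    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: min(d[tuple(sorted((i, j)))]
                                     for i in clusters[p[0]]
                                     for j in clusters[p[1]]))
        clusters[a] |= clusters[b]
        del clusters[b]
    # From each cluster, keep the single most class-relevant feature.
    return sorted(max(c, key=lambda j: mi(cols[j], y)) for c in clusters)
```

With two identical class-determining features and one weakly informative one, the redundant pair collapses into a single cluster, so only one copy survives alongside the third feature.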