In this paper, a supervised feature selection approach is presented, based on a metric applied to both continuous and discrete data representations. The method builds a dissimilarity space over the features using information-theoretic measures, in particular the conditional mutual information between features given a relevant variable that represents the class labels. By applying hierarchical clustering, the algorithm searches for a compressed representation of the information contained in the original feature set. The proposed technique is compared with other state-of-the-art methods that are also based on information measures. Finally, several experiments are presented to show the effectiveness of the selected features in terms of classification accuracy.
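The pipeline described above can be sketched in a few steps: estimate a conditional-mutual-information-based dissimilarity between every pair of features, cluster the features hierarchically, and keep one representative per cluster. The sketch below is a minimal illustration for discrete features; the particular dissimilarity d(i,j) = I(Xi;Y|Xj) + I(Xj;Y|Xi), the single-linkage merging rule, and the per-cluster relevance criterion are assumptions for illustration, not the paper's exact definitions.

```python
import math
from collections import Counter
from itertools import combinations

def entropy(*cols):
    """Joint Shannon entropy (bits) of one or more discrete sequences."""
    joint = list(zip(*cols))
    n = len(joint)
    return -sum((c / n) * math.log2(c / n) for c in Counter(joint).values())

def mi(x, y):
    """Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def cmi(x, y, z):
    """Conditional mutual information I(X;Y|Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)

def select_features(X, y, k):
    """Cluster features by a CMI-derived dissimilarity; keep one per cluster.

    Dissimilarity d(i,j) = I(Xi;Y|Xj) + I(Xj;Y|Xi): large when each feature
    carries class information the other lacks (an illustrative choice, not
    necessarily the paper's exact metric).
    """
    m = len(X[0])
    cols = [[row[j] for row in X] for j in range(m)]
    d = {(i, j): cmi(cols[i], y, cols[j]) + cmi(cols[j], y, cols[i])
         for i, j in combinations(range(m), 2)}
    # Single-linkage agglomerative clustering until k clusters remain.
    clusters = [{j} for j in range(m)]
    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: min(d[tuple(sorted((i, j)))]
                                     for i in clusters[p[0]]
                                     for j in clusters[p[1]]))
        clusters[a] |= clusters[b]
        del clusters[b]
    # From each cluster, keep the single most class-relevant feature.
    return sorted(max(c, key=lambda j: mi(cols[j], y)) for c in clusters)
```

With two identical class-determining features and one weakly informative one, the redundant pair collapses into a single cluster, so only one copy survives alongside the third feature.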